Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: thread shared metadata file into rover to enable use of alarm #390

Merged
merged 11 commits into from
Jul 24, 2023
2 changes: 1 addition & 1 deletion .github/workflows/clippy.yml
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,7 @@ jobs:
clippy_check:
runs-on: ubuntu-latest
steps:
- uses: dtolnay/rust-toolchain@1.66.1
- uses: dtolnay/rust-toolchain@1.70.0
with:
components: clippy
- uses: actions/checkout@v3
Expand Down
2 changes: 1 addition & 1 deletion .github/workflows/pages.yml
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,7 @@ jobs:
- name: Checkout
uses: actions/checkout@v3

- uses: dtolnay/rust-toolchain@1.66.0
- uses: dtolnay/rust-toolchain@1.70.0

- name: Cargo doc
run: |
Expand Down
2 changes: 1 addition & 1 deletion .github/workflows/rust.yml
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,7 @@ jobs:
runs-on: ubuntu-latest

steps:
- uses: dtolnay/rust-toolchain@1.66.0
- uses: dtolnay/rust-toolchain@1.70.0
with:
components: rustfmt
- uses: actions/checkout@v3
Expand Down
2 changes: 1 addition & 1 deletion martian-derive/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -152,7 +152,7 @@
// struct. We can build the `stage_name()` function from the struct name.
// TODO: This might also be a good time to check that the implementation is
// actually for a unit struct. But is it possible?
let stage_struct = item_impl.self_ty.clone();
let stage_struct = item_impl.self_ty;
let stage_struct_name = match *stage_struct {
Type::Path(ref ty_path) => {
let segments = &ty_path.path.segments;
Expand Down Expand Up @@ -322,10 +322,10 @@

Err(Error::new(
span,
format!(
"{}. You are trying to use it on {} trait implementation.",
ATTR_NOT_ON_TRAIT_IMPL_ERROR, last_ident
),

Check warning on line 328 in martian-derive/src/lib.rs

View workflow job for this annotation

GitHub Actions / clippy_check

variables can be used directly in the `format!` string

warning: variables can be used directly in the `format!` string --> martian-derive/src/lib.rs:325:9 | 325 | / format!( 326 | | "{}. You are trying to use it on {} trait implementation.", 327 | | ATTR_NOT_ON_TRAIT_IMPL_ERROR, last_ident 328 | | ), | |_________^ | = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#uninlined_format_args = note: requested on the command line with `-W clippy::uninlined-format-args`
))
}

Expand Down Expand Up @@ -485,10 +485,10 @@
Err(_) => {
return syn::Error::new_spanned(
field,
format!(
"{}. Found {}",
INVALID_MRO_TYPE_ERROR, val_string
),

Check warning on line 491 in martian-derive/src/lib.rs

View workflow job for this annotation

GitHub Actions / clippy_check

variables can be used directly in the `format!` string

warning: variables can be used directly in the `format!` string --> martian-derive/src/lib.rs:488:45 | 488 | / ... format!( 489 | | ... "{}. Found {}", 490 | | ... INVALID_MRO_TYPE_ERROR, val_string 491 | | ... ), | |_______________________^ | = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#uninlined_format_args
)
.to_compile_error()
.into();
Expand Down Expand Up @@ -517,10 +517,10 @@
if blacklist.contains(name.as_str()) {
return syn::Error::new(
field.ident.unwrap().span(),
format!(
"Field name {} is not allowed here since it is a martian keyword",
name
),

Check warning on line 523 in martian-derive/src/lib.rs

View workflow job for this annotation

GitHub Actions / clippy_check

variables can be used directly in the `format!` string

warning: variables can be used directly in the `format!` string --> martian-derive/src/lib.rs:520:17 | 520 | / format!( 521 | | "Field name {} is not allowed here since it is a martian keyword", 522 | | name 523 | | ), | |_________________^ | = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#uninlined_format_args
)
.to_compile_error()
.into();
Expand Down
2 changes: 1 addition & 1 deletion martian-filetypes/src/bin_file.rs
Original file line number Diff line number Diff line change
Expand Up @@ -437,7 +437,7 @@ mod tests {

#[test]
fn test_lazy_read() -> Result<(), Error> {
let values: Vec<u16> = (0..100).into_iter().collect();
let values: Vec<u16> = (0..100).collect();
let bin_file = BincodeFile::tempfile()?;
bin_file.write(&values)?;

Expand Down
4 changes: 2 additions & 2 deletions martian-filetypes/tests/ui/invalid_type_read.stderr
Original file line number Diff line number Diff line change
Expand Up @@ -2,14 +2,14 @@ error[E0308]: `?` operator has incompatible types
--> tests/ui/invalid_type_read.rs:27:37
|
27 | let new_feature: Creature = feat_file.read()?; // Compiler error
| ^^^^^^^^^^^^^^^^^ expected struct `Creature`, found struct `Feature`
| ^^^^^^^^^^^^^^^^^ expected `Creature`, found `Feature`
|
= note: `?` operator cannot convert from `Feature` to `Creature`

error[E0308]: `?` operator has incompatible types
--> tests/ui/invalid_type_read.rs:34:37
|
34 | let new_feature: Creature = feat_file.read()?; // Compiler error
| ^^^^^^^^^^^^^^^^^ expected struct `Creature`, found struct `Feature`
| ^^^^^^^^^^^^^^^^^ expected `Creature`, found `Feature`
|
= note: `?` operator cannot convert from `Feature` to `Creature`
12 changes: 6 additions & 6 deletions martian-filetypes/tests/ui/invalid_type_write.stderr
Original file line number Diff line number Diff line change
Expand Up @@ -2,13 +2,13 @@ error[E0308]: mismatched types
--> tests/ui/invalid_type_write.rs:26:25
|
26 | feat_file.write(&creature)?; // This is a compiler error
| ----- ^^^^^^^^^ expected struct `Feature`, found struct `Creature`
| ----- ^^^^^^^^^ expected `&Feature`, found `&Creature`
| |
| arguments to this function are incorrect
| arguments to this method are incorrect
|
= note: expected reference `&Feature`
found reference `&Creature`
note: associated function defined here
note: method defined here
--> src/lib.rs
|
| fn write(&self, item: &T) -> Result<(), Error> {
Expand All @@ -18,13 +18,13 @@ error[E0308]: mismatched types
--> tests/ui/invalid_type_write.rs:31:25
|
31 | feat_file.write(&creature)?; // This is a compiler error
| ----- ^^^^^^^^^ expected struct `Feature`, found struct `Creature`
| ----- ^^^^^^^^^ expected `&Feature`, found `&Creature`
| |
| arguments to this function are incorrect
| arguments to this method are incorrect
|
= note: expected reference `&Feature`
found reference `&Creature`
note: associated function defined here
note: method defined here
--> src/lib.rs
|
| fn write(&self, item: &T) -> Result<(), Error> {
Expand Down
9 changes: 4 additions & 5 deletions martian/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -9,12 +9,14 @@
use anyhow::{format_err, Context};
use backtrace::Backtrace;
use log::{error, info};

use std::collections::HashMap;
use std::fmt::Write as FmtWrite;
use std::fs::File;
use std::io::Write as IoWrite;
use std::os::unix::io::{FromRawFd, IntoRawFd};
use std::path::Path;

use std::{io, panic};
use time::format_description::modifier::{Day, Hour, Minute, Month, Second, Year};
use time::format_description::FormatItem::Literal;
Expand Down Expand Up @@ -182,11 +184,9 @@
let log_file: File = unsafe { File::from_raw_fd(3) };
setup_logging(log_file, level);

// setup Martian metadata (and an extra copy for use in the panic handler
let _md = initialize(args).context("IO Error initializing stage");

// setup Martian metadata
// special handler for error in stage setup
let mut md = match _md {
let mut md = match initialize(args).context("IO Error initializing stage") {
Ok(m) => m,
Err(e) => {
let _ = write_errors(&format!("{e:?}"), false);
Expand Down Expand Up @@ -357,14 +357,13 @@
mro_string
)
} else {
assert!(
header_comment
.lines()
.into_iter()
.all(|line| line.trim_end().is_empty() || line.starts_with('#')),
"All non-empty header lines must start with '#', but got\n{}",
header_comment
);

Check warning on line 366 in martian/src/lib.rs

View workflow job for this annotation

GitHub Actions / clippy_check

variables can be used directly in the `format!` string

warning: variables can be used directly in the `format!` string --> martian/src/lib.rs:360:9 | 360 | / assert!( 361 | | header_comment 362 | | .lines() 363 | | .all(|line| line.trim_end().is_empty() || line.starts_with('#')), 364 | | "All non-empty header lines must start with '#', but got\n{}", 365 | | header_comment 366 | | ); | |_________^ | = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#uninlined_format_args
format!(
"{}
#
Expand Down
122 changes: 56 additions & 66 deletions martian/src/metadata.rs
Original file line number Diff line number Diff line change
Expand Up @@ -5,11 +5,11 @@ use serde_json::map::Map;
use serde_json::{self, Value};
use std::any::type_name;
use std::borrow::Cow;
use std::collections::HashSet;
use std::fs::{rename, File, OpenOptions};
use std::io::{ErrorKind, Write};
use std::os::unix::io::{FromRawFd, IntoRawFd};
use std::os::unix::io::FromRawFd;
use std::path::{Path, PathBuf};
use std::sync::{Arc, Mutex};
use std::time::SystemTime;
use time::{OffsetDateTime, UtcOffset};

Expand All @@ -28,7 +28,8 @@ pub struct Metadata {
run_file: String,
raw_jobinfo: JsonDict,
pub jobinfo: JobInfo, // Partially parsed Job info
cache: HashSet<String>,
/// Shared reference to the alarm file.
alarm_file: SharedFile,
}

#[derive(Debug, Default, Clone, Serialize, Deserialize)]
Expand Down Expand Up @@ -58,10 +59,11 @@ impl Default for Version {
}
}

#[derive(Debug, Clone, Serialize, Deserialize)]
#[derive(Default, Debug, Clone, Serialize, Deserialize)]
#[serde(rename_all = "snake_case")]
#[non_exhaustive]
pub enum ProfileMode {
#[default]
Disable,
Cpu,
Line,
Expand All @@ -70,12 +72,6 @@ pub enum ProfileMode {
Pyspy,
}

impl Default for ProfileMode {
fn default() -> ProfileMode {
ProfileMode::Disable
}
}

// Stuff that will be added to the _jobinfo under the "rust" key
#[derive(Debug, Serialize)]
struct RustAdapterInfo {
Expand Down Expand Up @@ -150,23 +146,23 @@ impl Metadata {
let metadata_path = args.pop().unwrap();
let stage_type = args.pop().unwrap();
let stage_name = args.pop().unwrap();
let alarm_file = SharedFile::new(make_metadata_file_path(metadata_path.as_ref(), "alarm"));

Metadata {
stage_name,
stage_type,
metadata_path,
files_path,
run_file,
cache: HashSet::new(),
raw_jobinfo: Map::new(),
jobinfo: JobInfo::default(),
jobinfo: Default::default(),
alarm_file,
}
}

/// Path within chunk
pub fn make_path(&self, name: &str) -> PathBuf {
let md: &Path = self.metadata_path.as_ref();
md.join([METADATA_PREFIX, name].concat())
make_metadata_file_path(self.metadata_path.as_ref(), name)
}

/// Write to a file inside the chunk
Expand All @@ -182,39 +178,32 @@ impl Metadata {
}

/// Update the Martian journal -- so that Martian knows what we've updated
fn update_journal_main(&mut self, name: &str, force: bool) -> Result<()> {
fn update_journal(&self, name: &str) -> Result<()> {
let journal_name: Cow<str> = if self.stage_type != "main" {
format!("{}_{name}", self.stage_type).into()
} else {
name.into()
};

if force || !self.cache.contains(name) {
let tmp_run_file = format!("{}.{journal_name}.tmp", self.run_file);
let run_file = &tmp_run_file[..tmp_run_file.len() - 4];

{
let mut f = File::create(&tmp_run_file)?;
if let Err(err) = f.write_all(make_timestamp_now().as_bytes()) {
// Pretty much ignore this error. The only reason we need
// any content at all in this file is because some
// filesystems behave strangely with completely empty files.
eprintln!("Writing journal file {tmp_run_file}: {err}");
}
let tmp_run_file = format!("{}.{journal_name}.tmp", self.run_file);
let run_file = &tmp_run_file[..tmp_run_file.len() - 4];

{
let mut f = File::create(&tmp_run_file)?;
if let Err(err) = f.write_all(make_timestamp_now().as_bytes()) {
// Pretty much ignore this error. The only reason we need
// any content at all in this file is because some
// filesystems behave strangely with completely empty files.
eprintln!("Writing journal file {tmp_run_file}: {err}");
}
rename(tmp_run_file.as_str(), run_file).or_else(ignore_not_found)?;
self.cache.insert(journal_name.into_owned());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why we had this cache in the first place but it wasn't serving any real purpose. The only metadata files we might write to repeatedly are

  1. _log, which is handled by mrjob
  2. _progress, which this isn't exposing and for which we'd want to write the journal unconditionally every time.
  3. _alarms, for which mrp doesn't actually need the journal at all. And if you're writing enough alarms that journaling it repeatedly causes performance degradation, you have other problems to worry about.

}
rename(tmp_run_file.as_str(), run_file).or_else(ignore_not_found)?;

Ok(())
}

fn update_journal(&mut self, name: &str) -> Result<()> {
self.update_journal_main(name, false)
}

/// Write JSON to a chunk file
pub(crate) fn write_json_obj(&mut self, name: &str, object: &JsonDict) -> Result<()> {
pub(crate) fn write_json_obj(&self, name: &str, object: &JsonDict) -> Result<()> {
serde_json::to_writer_pretty(File::create(self.make_path(name))?, object)?;
self.update_journal(name)?;
Ok(())
Expand Down Expand Up @@ -271,40 +260,13 @@ impl Metadata {
Error::new(e).context(context)
}

fn _append(&mut self, name: &str, message: &str) -> Result<()> {
let filename = self.make_path(name);
let mut file = OpenOptions::new()
.create(true)
.append(true)
.open(filename)?;
file.write_all(message.as_bytes())?;
file.write_all(b"\n")?;
// Ensure the file is closed before we write the journal, to reduce
// the chances that `mrp` sees the journal entry before the file content
// has be sync'ed. This can be an issue on nfs systems.
drop(file);
self.update_journal(name)?;
Ok(())
pub(crate) fn alarm_file(&self) -> &SharedFile {
&self.alarm_file
}

/// Write to _log
pub fn log(&mut self, level: &str, message: &str) -> Result<()> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For those not in our slack conversation, turns out this is unnecessary because we're already overriding the global-scope logging macros to write to the log. And we weren't using it. And we shouldn't, because this was buggy, as

  1. It didn't use write_all.
  2. if it panicked or if write or flush returned an error, it would close the file descriptor. Which means that if you continued on anyway and opened some other file, it would get fd 3, and then the next time you tried logging you'd log into whatever that file was.

let mut log_file = unsafe { File::from_raw_fd(3) };

log_file
.write(format!("{} [{level}] {message}", make_timestamp_now()).as_bytes())
.and(log_file.flush())?;

let _ = log_file.into_raw_fd();
Ok(())
}

pub fn log_time(&mut self, message: &str) -> Result<()> {
self.log("time", message)
}

pub fn alarm(&mut self, message: &str) -> Result<()> {
self._append("alarm", &format!("{} {message}", make_timestamp_now()))
/// Write a message to the stage alarms.
pub fn alarm(&self, message: &str) -> Result<()> {
self.alarm_file.appendln(message, true)
}

#[cold]
Expand Down Expand Up @@ -369,6 +331,34 @@ impl Metadata {
}
}

fn make_metadata_file_path(metadata_dir: &Path, name: &str) -> PathBuf {
metadata_dir.join([METADATA_PREFIX, name].concat())
}

/// Manage shared access to a metadata file.
#[derive(Debug, Clone)]
pub(crate) struct SharedFile(Arc<Mutex<PathBuf>>);

impl SharedFile {
pub fn new(path: PathBuf) -> Self {
Self(Arc::new(Mutex::new(path)))
}

/// Append the provided contents to the file.
/// Creates the file if it does not exist.
/// Appends a newline after contents.
/// Prepends a timestamp if requested.
pub fn appendln(&self, contents: &str, prepend_timestamp: bool) -> Result<()> {
let path = self.0.lock().unwrap();
let mut file = OpenOptions::new().create(true).append(true).open(&*path)?;
if prepend_timestamp {
write!(file, "{} ", make_timestamp_now())?;
}
writeln!(file, "{contents}")?;
Ok(())
}
}

#[cfg(test)]
mod tests {
use super::*;
Expand Down
9 changes: 2 additions & 7 deletions martian/src/mro.rs
Original file line number Diff line number Diff line change
Expand Up @@ -33,12 +33,12 @@
match field_width {
Some(width) => {
let min_width = self.min_width();
assert!(
width >= min_width,
"Need a minimum width of {}. Found {}",
min_width,
width
);

Check warning on line 41 in martian/src/mro.rs

View workflow job for this annotation

GitHub Actions / clippy_check

variables can be used directly in the `format!` string

warning: variables can be used directly in the `format!` string --> martian/src/mro.rs:36:17 | 36 | / assert!( 37 | | width >= min_width, 38 | | "Need a minimum width of {}. Found {}", 39 | | min_width, 40 | | width 41 | | ); | |_________________^ | = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#uninlined_format_args = note: requested on the command line with `-W clippy::uninlined-format-args`
self.mro_string_with_width(width)
}
None => self.mro_string_no_width(),
Expand Down Expand Up @@ -441,11 +441,11 @@
// Check that name does not match any martian token.
fn verify(&self) {
for &token in MARTIAN_TOKENS {
assert!(
self.name != token,
"Martian token {} cannot be used as field name",
token
);

Check warning on line 448 in martian/src/mro.rs

View workflow job for this annotation

GitHub Actions / clippy_check

variables can be used directly in the `format!` string

warning: variables can be used directly in the `format!` string --> martian/src/mro.rs:444:13 | 444 | / assert!( 445 | | self.name != token, 446 | | "Martian token {} cannot be used as field name", 447 | | token 448 | | ); | |_____________^ | = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#uninlined_format_args
}
assert!(!self.name.starts_with("__"));
}
Expand All @@ -469,18 +469,13 @@
}
}

#[derive(Debug, Clone, Copy, Serialize, Deserialize, PartialEq, Eq)]
#[derive(Debug, Clone, Copy, Serialize, Deserialize, PartialEq, Eq, Default)]
pub enum Volatile {
#[default]
Strict,
False,
}

impl Default for Volatile {
fn default() -> Self {
Volatile::Strict
}
}

impl From<&Volatile> for &'static str {
fn from(v: &Volatile) -> Self {
match v {
Expand Down Expand Up @@ -846,12 +841,12 @@

if let Some(ref chunk_in_out) = self.minified_chunk_in_outs() {
writeln!(f, ") split (")?;
write!(
f,
"{params:<ty_width$}",
params = chunk_in_out,
ty_width = ty_width
)?;

Check warning on line 849 in martian/src/mro.rs

View workflow job for this annotation

GitHub Actions / clippy_check

variables can be used directly in the `format!` string

warning: variables can be used directly in the `format!` string --> martian/src/mro.rs:844:13 | 844 | / write!( 845 | | f, 846 | | "{params:<ty_width$}", 847 | | params = chunk_in_out, 848 | | ty_width = ty_width 849 | | )?; | |_____________^ | = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#uninlined_format_args
}

if self.using_attrs.need_using() {
Expand Down
Loading