Exploiting segments in a way that scales #558
Comments
This is a creative idea, and my personal view is that such scaling should already be possible (albeit not immediately obvious or convenient) without baking in additional behaviors. Given a larger "global" data frame and an agent that applies to that data frame, you could start by creating a list of data splits by segment yourself, and then iterate over those splits to produce segment-level agents by re-interrogating the global agent on each split.
An example:
library(pointblank)
# ---- The familiar interface
# Data frame with segments
set.seed(1)
df <- data.frame(segment = forcats::fct_inorder(month.name[1:3]), val = rnorm(300))
# Global agent
agent <- df %>%
  create_agent() %>%
  col_vals_gt(val, 0)
# ---- Data split strategy
# Split data by segment to iterate over
df_split <- split(df, ~ segment)
# Re-interrogate the `agent` on each data split
segment_agents <- lapply(df_split, function(x) {
  agent %>%
    set_tbl(x, label = unique(x$segment)) %>%
    interrogate()
})
# Grab segment-level agent reports
segment_agent_reports <- lapply(segment_agents, get_agent_report)
# A simple rendering of reports inside a single div
do.call(htmltools::div, unname(segment_agent_reports)) %>%
  print(browse = TRUE)
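If the per-segment HTML reports become too long to scan, the same split strategy could also feed a compact summary table. The following is only a sketch under assumptions: it rebuilds a small example data frame so that it runs on its own, `summary_tbl` is a name made up for illustration, and the fields read from `get_agent_x_list()` (`i`, `n`, `n_failed`, `f_passed`) are assumed to be available in the installed version of pointblank.

```r
library(pointblank)

# Example data: three monthly segments of 100 rows each
set.seed(1)
df <- data.frame(
  segment = factor(rep(month.name[1:3], each = 100), levels = month.name[1:3]),
  val = rnorm(300)
)

# Global agent holding the validation plan
agent <- create_agent(tbl = df) %>%
  col_vals_gt(val, 0)

# Re-interrogate the global agent on each segment's rows
segment_agents <- lapply(split(df, df$segment), function(x) {
  agent %>%
    set_tbl(tbl = x, label = as.character(x$segment[1])) %>%
    interrogate()
})

# One row per segment and validation step (`summary_tbl` is illustrative)
summary_tbl <- do.call(rbind, lapply(names(segment_agents), function(nm) {
  xl <- get_agent_x_list(segment_agents[[nm]])
  data.frame(
    segment  = nm,
    step     = xl$i,
    n        = xl$n,
    n_failed = xl$n_failed,
    f_passed = xl$f_passed
  )
}))

print(summary_tbl)
```

A table like this could sit above the full reports so that a reader only opens the months that show failures.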
Yes, I've been doing something similar to this, although your method is probably more elegant! Perhaps then it's not worth complicating the functions further.
Related to #451, the use of the `segments` parameter is a powerful way of grouping and multiplying your validation tests. Whilst having a custom label is useful to be able to see the segments at a glance, this could cause very large validation reports that are difficult to navigate and parse.
As a starting point, it would be useful to set a global segmentation scheme in `create_agent()` (much like `actions`), which will then apply to every validation function. Perhaps a way of overriding this for specific validation checks (e.g. setting to NULL) would be needed too.
When it comes to organising this in the report, would it be possible to split the HTML output into sections for each segment (or even better, tabs)? My use case is monthly reports spanning years, and I want to be able to see the issues within specific months.
I have a feeling I am just scratching the surface of what could be done here, and would be keen to hear others' thoughts.
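For context, a minimal sketch of how segmentation is declared per validation step, which is what motivates the request. It uses only the existing `segments` argument of the validation functions; the example data frame is made up for illustration, and nothing here implements the proposed global option in `create_agent()`.

```r
library(pointblank)

# Made-up example data: three monthly segments of 100 rows each
set.seed(1)
df <- data.frame(
  segment = rep(month.name[1:3], each = 100),
  val = rnorm(300)
)

# `segments` has to be repeated on every validation function, and each
# call expands into one validation step per segment value
agent <- create_agent(tbl = df) %>%
  col_vals_gt(val, 0, segments = vars(segment)) %>%
  col_vals_not_null(val, segments = vars(segment)) %>%
  interrogate()

# Two validation functions x three segments = six steps in one report
n_steps <- nrow(agent$validation_set)
print(n_steps)
```

With monthly segments spanning several years, the number of steps in a single report grows as (validation functions) x (months), which is the navigation problem described above.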