observable-framework-lib-arquero
Library: Arquero
Observable Framework documentation: Library: Arquero Source: https://observablehq.com/framework/lib-arquero
Arquero
Arquero is a JavaScript library for “query processing and transformation of array-backed data tables.” Arquero is available by default as aq in Markdown, but you can import it explicitly like so:
import * as aq from "npm:arquero";
Following the documentation website’s introduction, let’s create a table of the Average hours of sunshine per month, from usclimatedata.com.
const dt = aq.table({
"Seattle": [69, 108, 178, 207, 253, 268, 312, 281, 221, 142, 72, 52],
"Chicago": [135, 136, 187, 215, 281, 311, 318, 283, 226, 193, 113, 106],
"San Francisco": [165, 182, 251, 281, 314, 330, 300, 272, 267, 243, 189, 156]
});
Arquero is column-oriented: each column is an array of values of a given type. Here, numbers representing hours of sunshine per month. But an Arquero table is also iterable and as such, its contents can be displayed with Inputs.table.
Inputs.table(dt)
An Arquero table can also be used to make charts with Observable Plot:
Plot.plot({
x: {tickFormat: Plot.formatMonth()},
y: {grid: true, label: "Hours of sunshine ☀️ per month"},
marks: [
Plot.ruleY([0]),
Plot.lineY(dt, {y: "Seattle", marker: true, stroke: "red"}),
Plot.lineY(dt, {y: "Chicago", marker: true, stroke: "turquoise"}),
Plot.lineY(dt, {y: "San Francisco", marker: true, stroke: "orange"})
]
})
Arquero supports a range of data transformation tasks, including filter, sample, aggregation, window, join, and reshaping operations. For example, the following operation derives differences between Seattle and Chicago and sorts the months accordingly.
Inputs.table(
dt.derive({
month: (d) => aq.op.row_number(),
diff: (d) => d.Seattle - d.Chicago
})
.select("month", "diff")
.orderby(aq.desc("diff"))
)
Is Seattle more correlated with San Francisco or Chicago?
Inputs.table(
dt.rollup({
corr_sf: aq.op.corr("Seattle", "San Francisco"),
corr_chi: aq.op.corr("Seattle", "Chicago")
})
)
We can aggregate statistics per city. The following code reshapes (or “folds”) the data into two columns city & sun and shows the output as objects:
dt.fold(aq.all(), {as: ["city", "sun"]})
.groupby("city")
.rollup({
min: aq.op.min("sun"),
max: aq.op.max("sun"),
avg: (d) => aq.op.average(d.sun), // equivalent to aq.op.average("sun")
med: (d) => aq.op.median(d.sun), // equivalent to aq.op.median("sun")
skew: ({sun}) => (aq.op.mean(sun) - aq.op.median(sun)) / aq.op.stdev(sun)
})
.objects()
To load an Arquero table from an Apache Arrow, Apache Parquet, CSV, TSV, or JSON file, use file.arquero :
const flights = FileAttachment("flights-200k.arrow").arquero();
This is equivalent to:
const flights = aq.loadArrow(FileAttachment("flights-200k.arrow").href);
For more, see Arquero’s official documentation.
More from spqw/skill-observable-framework
observable-framework-lib-deckgl
Using Deck.gl in Observable Framework for large-scale geospatial data visualization.
10observable-framework-lib-mapbox-gl
Using Mapbox GL JS in Observable Framework for vector tile maps.
4observable-framework-jsx
Using JSX (React components) in Observable Framework pages.
3observable-framework-files
Working with files in Observable Framework — file attachments, file-based routing, and FileAttachment API.
3observable-framework-input-range
Range slider input component in Observable Framework — select a numeric value.
3observable-framework-lib-tex
Using TeX/LaTeX in Observable Framework for mathematical typesetting.
3