Tables
The Table class provides a tabular data structure for structured data processing. Tables are the primary data type returned by read() when loading CSV, JSON arrays, and other tabular formats.
For reading and writing table data, see the I/O guide. For data pipelines and the data {} block, see Tables & Data.
Standalone Functions
Table operations are standalone, pipe-friendly functions: tableWhere(), tableSelect(), tableDerive(), etc. Use the pipe operator |> to chain them, or call them directly with the table as the first argument. See the Data Tutorials for hands-on walkthroughs.
Constructor
Table
Table(rows, columns?) -> TableCreates a table from an array of row objects. Optionally specify column names.
t = Table([
{ name: "Alice", age: 30, city: "NYC" },
{ name: "Bob", age: 25, city: "LA" },
{ name: "Carol", age: 35, city: "NYC" }
])Properties
rows
t.rows -> IntThe number of rows in the table.
length
t.length -> IntAlias for rows. The number of rows in the table.
columns
t.columns -> [String]The column names in order.
shape
t.shape -> [Int, Int]Returns [row_count, column_count].
t.shape // [3, 3]Access
toArray
t.toArray() -> [Object]Returns the underlying rows as an array of objects.
at
t.at(index) -> Object | NilReturns the row at the given index. Supports negative indices.
first = t.at(0) // first row
last = t.at(-1) // last rowslice
t.slice(start, end) -> TableReturns a new table with rows from start to end.
getColumn
t.getColumn(name) -> [Any]Returns all values of a column as an array.
names = t.getColumn("name") // ["Alice", "Bob", "Carol"]Query
where
t.where(predicate) -> TableFilters rows matching a predicate. Use .column shorthand for column access.
adults = t.where(.age >= 18)
nyc = t.where(.city == "NYC")select
t.select(...columns) -> TableReturns a table with only the specified columns.
names = t.select("name", "age")derive
t.derive(column = expr) -> TableAdds or replaces a column with a computed value.
t.derive(.name_upper = .name |> upper())
t.derive(.age_group = if .age >= 30 { "senior" } else { "junior" })sort_by
t.sortBy(column, desc?) -> TableSorts the table by a column. Optionally in descending order.
t.sortBy(.age)
t.sortBy(.age, desc: true)limit
t.limit(n) -> TableReturns the first n rows.
top_5 = t.sortBy(.age, desc: true).limit(5)Grouping and Aggregation
group_by
t.groupBy(column) -> GroupedTableGroups rows by a column value. Must be followed by agg().
t.groupBy(.city)agg
grouped.agg(name: agg_fn, ...) -> TableAggregates grouped data. Inside agg(), use these aggregation functions (the compiler automatically transforms them to their agg_* runtime equivalents):
| Function | Description |
|---|---|
sum(.column) | Sum of column values |
count() | Number of rows in each group |
mean(.column) | Mean (average) of column values |
median(.column) | Median of column values |
min(.column) | Minimum column value |
max(.column) | Maximum column value |
summary = t
|> groupBy(.city)
|> agg(
count: count(),
avg_age: mean(.age),
oldest: max(.age)
)Joins
join
t.join(other, opts) -> TableJoins two tables on matching columns.
Options:
left/right-- column names or accessors to join onhow-- join type:"inner"(default),"left","right","outer","cross","anti","semi"
users = Table([
{ id: 1, name: "Alice" },
{ id: 2, name: "Bob" }
])
orders = Table([
{ user_id: 1, amount: 50 },
{ user_id: 1, amount: 30 },
{ user_id: 2, amount: 70 }
])
// Inner join (default)
result = users |> join(orders, left: "id", right: "user_id")
// Left join — keeps all rows from the left table
left_result = users |> join(orders, left: "id", right: "user_id", how: "left")// All join types
a |> join(b, left: .id, right: .uid, how: "right") // all right rows
a |> join(b, left: .id, right: .uid, how: "outer") // all rows from both
a |> join(b, how: "cross") // cartesian product
a |> join(b, left: .id, right: .uid, how: "anti") // left rows with no match
a |> join(b, left: .id, right: .uid, how: "semi") // left rows with a matchReshaping
pivot
t |> pivot(index: column, columns: column, values: column) -> TablePivots rows into columns. Takes named arguments index, columns, and values.
sales = Table([
{ region: "East", quarter: "Q1", revenue: 100 },
{ region: "East", quarter: "Q2", revenue: 150 },
{ region: "West", quarter: "Q1", revenue: 200 }
])
sales |> pivot(index: "region", columns: "quarter", values: "revenue")
// { _index: "East", Q1: 100, Q2: 150 }
// { _index: "West", Q1: 200, Q2: nil }Note: The index column is renamed to _index in the output.
unpivot
t |> unpivot(id: column, columns: [String]) -> TableConverts columns into rows (the inverse of pivot). Takes named arguments id (the identifier column) and columns (column names to unpivot). Returns a table with id, variable, and value columns.
explode
t.explode(column) -> TableExpands array values in a column into separate rows.
t = Table([
{ name: "Alice", tags: ["admin", "user"] },
{ name: "Bob", tags: ["user"] }
])
t.explode("tags")
// { name: "Alice", tags: "admin" }
// { name: "Alice", tags: "user" }
// { name: "Bob", tags: "user" }Cleaning
drop_duplicates
t |> dropDuplicates(by?: column) -> TableRemoves duplicate rows. Optionally specify a column to check for uniqueness via the by option. Without by, compares entire rows.
t |> dropDuplicates()
t |> dropDuplicates(by: .email)rename
t.rename(old, new) -> TableRenames a column.
t.rename("name", "full_name")cast
t.cast(column, type) -> TableConverts a column to a different type.
t.cast("age", "Int")
t.cast("price", "Float")drop_nil
t |> dropNil(column) -> TableRemoves rows where the specified column is nil.
t |> dropNil(.email)fill_nil
t.fillNil(column, value) -> TableReplaces nil values in a column with a default value.
t.fillNil(.score, 0)
t.fillNil(.status, "unknown")Inspection
peek
t |> peek(n?: Int, title?: String) -> TablePrints a preview of the table and returns it (passthrough for pipelines). Shows the first n rows (default 10).
data |> peek()
data |> peek(n: 5, title: "Sample data")describe
t.describe() -> TableReturns summary statistics for numeric columns (count, mean, std, min, max, quartiles).
t.describe() |> peek()schema_of
schemaOf(table) -> ObjectReturns the inferred schema (column names and types) of a table.
schemaOf(users)
// { name: "String", age: "Int", email: "String" }Window Functions
window
t |> window(partition_by?: column, order_by?: column, desc?: Bool, ...fns) -> TableComputes values across partitions of rows without collapsing them. Returns a new Table with all original columns plus new window columns.
Options:
partition_by-- column to partition rows by (optional; without it, entire table is one partition)order_by-- column to sort rows within each partition (optional)desc-- sort descending within partitions (defaultfalse)
All other named arguments define new columns using window functions.
result = employees |> window(
partition_by: .dept,
order_by: .salary,
rn: row_number(),
rnk: rank(),
prev_salary: lag(.salary, 1),
running_total: running_sum(.salary)
)Ranking Functions
| Function | Description |
|---|---|
row_number() | Sequential number in partition (1, 2, 3, ...) |
rank() | Rank with gaps for ties (1, 2, 2, 4) |
dense_rank() | Rank without gaps (1, 2, 2, 3) |
percent_rank() | Relative rank as fraction (0.0 to 1.0) |
ntile(n) | Divide into n equal-sized buckets |
Offset Functions
| Function | Description |
|---|---|
lag(.col, offset?, default?) | Value from a previous row (default offset=1) |
lead(.col, offset?, default?) | Value from a following row (default offset=1) |
first_value(.col) | First value in the partition |
last_value(.col) | Last value in the partition |
Running Aggregates
| Function | Description |
|---|---|
running_sum(.col) | Cumulative sum |
running_count() | Cumulative count |
running_avg(.col) | Cumulative average |
running_min(.col) | Running minimum |
running_max(.col) | Running maximum |
moving_avg(.col, n) | Moving average over last n rows |
// Ranking within departments
employees |> window(
partition_by: .dept,
order_by: .salary,
desc: true,
salary_rank: row_number()
)
// Running totals and moving averages
sales |> window(
order_by: .date,
cumulative: running_sum(.revenue),
trend: moving_avg(.revenue, 7)
)
// Previous/next row comparison
prices |> window(
order_by: .date,
prev_price: lag(.price),
next_price: lead(.price)
)Sampling
sample
t |> sample(n, seed?) -> TableReturns a random subset of rows. If n < 1, treats it as a fraction. Optional seed for reproducibility.
subset = t |> sample(100) // 100 random rows
subset = t |> sample(0.1) // 10% of rows
subset = t |> sample(1000, seed: 42) // reproduciblestratified_sample
t |> stratified_sample(key, n, seed?) -> TableSamples n rows (or fraction) from each group defined by key.
balanced = t |> stratified_sample(.category, 50)
balanced = t |> stratified_sample(.region, 0.1, seed: 42)Visualization
All chart functions take a Table (or array of objects) and return an SVG string.
bar_chart
barChart(data, x:, y:, title?, width?, height?, color?, labels?, sort?) -> StringVertical bar chart. Options: color (hex), labels: true (show values), sort: "desc".
line_chart
lineChart(data, x:, y:, title?, width?, height?, color?, points?) -> StringLine chart. Supports multi-series via array of y columns: y: [.revenue, .cost].
scatter_chart
scatterChart(data, x:, y:, title?, width?, height?, color?) -> StringScatter plot with <circle> elements.
histogram
histogram(data, col:, bins?, title?, width?, height?, color?) -> StringHistogram with automatic binning. Default 20 bins.
pie_chart
pieChart(data, label:, value:, title?, width?, height?) -> StringPie chart with percentage labels.
heatmap
heatmap(data, x:, y:, value:, title?, width?, height?) -> StringGrid of colored cells. Requires categorical x/y and numeric value.
// Example: generate and save a chart
sales
|> groupBy(.region)
|> agg(revenue: sum(.amount))
|> barChart(x: .region, y: .revenue, title: "Revenue")
|> writeText("chart.svg")Combination
union
union(a, b) -> TableCombines two tables with the same columns, appending all rows.
all_users = union(active_users, inactive_users)Pipeline Example
result = read("sales.csv")
|> dropNil(.amount)
|> fillNil(.region, "Unknown")
|> where(.amount > 0)
|> derive(.quarter = dateFormat(.date, "QQ YYYY"))
|> groupBy(.region)
|> agg(
total: sum(.amount),
count: count(),
avg: mean(.amount)
)
|> sortBy(.total, desc: true)
|> peek(title: "Revenue by Region")
result |> write("revenue_summary.csv")