Apply a series of string, date, math, currency & geocoding transformations to a CSV column. Also has basic NLP functions (similarity, sentiment analysis, etc.).
A slimmed-down version of apply with only Datapusher+ relevant subcommands/operations (qsvdp binary variant only).
Drop headers from a CSV.
Concatenate CSV files by row or by column.
Provide data as input from your clipboard or save output to your clipboard.
Count the rows in a CSV file.
Remove duplicate rows. (See also the extdedup, extsort, sort & sortcheck commands.)
Infer extended metadata about a CSV using a GPT model from OpenAI's API.
Find the difference between two CSVs.
Add a new column enumerating rows with incremental or UUID identifiers. Can also be used to copy a column or fill a new column with a constant value.
Exports a specified Excel/ODS sheet to a CSV file.
Removes a set of CSV data from another set based on the specified columns.
Explode rows into multiple ones by splitting a column value based on the given separator.
Remove duplicate rows from an arbitrarily large CSV/text file using a memory-mapped, on-disk hash table. Unlike dedup, extdedup doesn't sort the deduped file.
Sort an arbitrarily large CSV/text file using a multithreaded external merge sort algorithm.
Fetches data from web services for each row with HTTP GET. Has HTTP/2 adaptive flow control, jql JSON query language support, dynamic throttling (RateLimit) & more.
Similar to fetch, but uses HTTP POST. (HTTP GET vs POST methods)
Force a CSV to have same-length records by either padding or truncating them.
A flattened view of CSV records. Useful for viewing one record at a time. e.g. qsv slice -i 5 data.csv | qsv flatten.
Reformat a CSV with different delimiters, record terminators or quoting rules. (Supports ASCII delimited data.)
Loop over a CSV to execute shell commands. (not available on Windows)
Build frequency tables of each column. Uses multithreading to go faster if an index is present.
Geocode CSV data using an updatable copy of the Geonames Cities database.
Show the headers of a CSV. Or show the intersection of all headers between many CSV files.
Create an index for a CSV, enabling constant-time indexing/random access. Multiple speed benefits & features when an index is present.
Read CSV data with special quoting, trimming, line-skipping & UTF-8 transcoding rules. Typically used to 'normalize' a CSV for processing with other commands.
Inner, outer, right, cross, anti & semi joins. Automatically creates a simple, in-memory hash index to make it faster.
Unlike join, it can process files larger than RAM, is multithreaded, has join key validation & pre-join filtering, supports asof joins & produces output without duplicate columns.
Convert newline-delimited JSON (JSONL/NDJSON) to CSV. See tojsonl command to convert CSV to JSONL.
qsv's Domain-Specific Language (DSL) for data-wrangling. Create multiple new computed columns, filter rows or compute aggregations by executing scripts.
Use a file dialog to choose an input file or output to a file with a save dialog.
Pseudonymise the values of the given column by replacing them with incremental identifiers.
Create a new computed column or filter rows by evaluating a Python expression on every row of a CSV. Python f-strings can be used for extended formatting & can embed expressions.
Replace CSV data using a regex. Applies the regex to each field individually.
Reverse the order of rows in a CSV. Unlike the sort --reverse command, it preserves the order of rows with the same key.
Modify headers of a CSV to only have 'safe' names - guaranteed 'database-ready'/'CKAN-ready' names.
Randomly draw rows (with optional seed) from a CSV using reservoir sampling (i.e., it uses memory proportional to the size of the sample).
Infer schema from CSV data, replete with data type & domain/range validation & output in JSON Schema format.
Run a regex over a CSV. Applies the regex to each field individually & shows only matching rows.
Run multiple regexes over a CSV in a single pass. Applies the regexes to each field individually & shows only matching rows.
Select, re-order, duplicate or drop columns.
Slice rows from any part of a CSV. When an index is present, this only has to parse the rows in the slice (instead of all rows leading up to the start of the slice).
Does streaming compression/decompression of the input using Google's Snappy framing format.
Sniff & infer CSV metadata (delimiter, header row, preamble rows, quote character, flexible, is_utf8, & much more). Also a general MIME type detector.
Sorts data in alphabetical (optionally case-insensitive), numerical, reverse, unique or random (optional seed) order.
Check if a CSV is sorted. May also optionally retrieve record count, sort breaks & duplicate count.
Split one CSV file into many CSV files of N chunks. Uses multithreading to go faster if an index is present.
Show aligned output of a CSV using elastic tabstops. May pair well with csvlens.
Convert CSV files to PostgreSQL, SQLite, XLSX, Parquet and Data Package.
Convert CSV to newline-delimited JSON (JSONL/NDJSON). Also infers JSON data types.
Transpose rows/columns of a CSV.
Validate CSV data using JSON Schema Validation. Also outputs the invalid records & an error report.
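
The reservoir sampling mentioned in the random-sampling entry above keeps memory proportional to the sample size rather than the file size. The following is a minimal Python sketch of the classic "Algorithm R" variant, offered only as an illustration of the technique; it is not qsv's actual implementation, and the function name `reservoir_sample`, its parameters and the inline sample data are all hypothetical.

```python
import csv
import io
import random


def reservoir_sample(rows, k, seed=None):
    """Return up to k rows drawn uniformly at random from an iterable.

    Hypothetical helper (Algorithm R): only k rows are held in memory,
    regardless of how many rows the input contains.
    """
    rng = random.Random(seed)  # an optional seed makes the draw repeatable
    reservoir = []
    for i, row in enumerate(rows):
        if i < k:
            # Fill the reservoir with the first k rows.
            reservoir.append(row)
        else:
            # Replace an existing entry with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = row
    return reservoir


# Illustrative in-memory CSV; a real file handle could be streamed the same way.
data = "id,name\n1,a\n2,b\n3,c\n4,d\n5,e\n"
reader = csv.reader(io.StringIO(data))
header = next(reader)  # keep the header row out of the sample
sample = reservoir_sample(reader, k=2, seed=42)
```

Because the rows are consumed one at a time from the reader, the input is never loaded in full, which is the property the entry describes.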