Apply a series of string, date, math, currency & geocoding transformations to a CSV column. Also has basic NLP functions (similarity, sentiment analysis, etc.).
A slimmed-down version of apply with only Datapusher+ relevant subcommands/operations (qsvdp binary variant only).
Available on the qsv CLI tool!
Drop headers from a CSV.
Concatenate CSV files by row or by column.
Provide data as input from your clipboard or save output to your clipboard.
Available on the qsv CLI tool!
Count the rows in a CSV file.
Remove duplicate rows. (See also extdedup, extsort, sort & sortcheck commands).
Infer extended metadata about a CSV using a GPT model from OpenAI's API.
Find the difference between two CSVs.
Add a new column that enumerates rows with incremental or UUID identifiers. Can also be used to copy a column or fill a new column with a constant value.
Export a specified Excel/ODS sheet to a CSV file.
Remove a set of CSV data from another set based on the specified columns.
Available on the qsv CLI tool!
Explode rows into multiple ones by splitting a column value based on the given separator.
Available on the qsv CLI tool!
Remove duplicate rows from an arbitrarily large CSV/text file using a memory-mapped, on-disk hash table. Unlike dedup, extdedup doesn't sort the deduped file.
Available on the qsv CLI tool!
Sort an arbitrarily large CSV/text file using a multithreaded external merge sort algorithm.
Available on the qsv CLI tool!
Fetch data from web services for each row with HTTP GET. Has HTTP/2 adaptive flow control, jql JSON query language support, dynamic throttling (RateLimit) & more.
Available on the qsv CLI tool!
Similar to fetch, but uses HTTP POST. (HTTP GET vs POST methods)
Available on the qsv CLI tool!
Force a CSV to have same-length records by either padding or truncating them.
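The padding-or-truncating idea can be sketched in a few lines of Python. This is only an illustration of the technique, not qsv's implementation; the function name `fix_lengths`, its parameters, and the choice to default to the longest record are assumptions made for the example.

```python
def fix_lengths(rows, length=None, pad=""):
    """Return rows forced to the same number of fields.

    Assumption for this sketch: when no length is given, the longest
    record sets the target length.
    """
    rows = [list(r) for r in rows]
    if length is None:
        length = max((len(r) for r in rows), default=0)
    # Slicing truncates long rows; the multiplication pads short ones
    # (a negative count yields an empty list, so long rows get no padding).
    return [r[:length] + [pad] * (length - len(r)) for r in rows]


ragged = [["a", "b", "c"], ["1"], ["x", "y"]]
padded = fix_lengths(ragged)
truncated = fix_lengths(ragged, length=2)
```

Here `padded` fills the short records with empty fields, while `truncated` cuts every record down to two fields.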
A flattened view of CSV records. Useful for viewing one record at a time. e.g. qsv slice -i 5 data.csv | qsv flatten.
Reformat a CSV with different delimiters, record terminators or quoting rules. (Supports ASCII delimited data.)
Loop over a CSV to execute shell commands. (not available on Windows)
Available on the qsv CLI tool!
Build frequency tables of each column. Uses multithreading to go faster if an index is present.
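A per-column frequency table is essentially one counter per column. The sketch below shows the idea in single-threaded Python; the helper name `frequency_tables` is hypothetical and nothing here reflects how qsv parallelizes the work when an index is present.

```python
import csv
import io
from collections import Counter


def frequency_tables(csv_text):
    """Count how often each value occurs in every column of a CSV string."""
    reader = csv.reader(io.StringIO(csv_text))
    headers = next(reader)
    counts = [Counter() for _ in headers]
    for row in reader:
        for counter, value in zip(counts, row):
            counter[value] += 1
    # most_common() orders values from most to least frequent.
    return {h: c.most_common() for h, c in zip(headers, counts)}


tables = frequency_tables("color,size\nred,S\nblue,S\nred,M\n")
```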
Geocode CSV data using an updatable copy of the Geonames Cities database.
Available on the qsv CLI tool!
Show the headers of a CSV. Or show the intersection of all headers between many CSV files.
Available on the qsv CLI tool!
Create an index for a CSV with constant time indexing/random access. Multiple speed benefits & features when an index is present.
Available on the qsv CLI tool!
Read CSV data with special quoting, trimming, line-skipping & UTF-8 transcoding rules. Typically used to 'normalize' a CSV for processing with other commands.
Inner, outer, right, cross, anti & semi joins. Automatically creates a simple, in-memory hash index to make it faster.
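To make the "in-memory hash index" idea concrete, here is a minimal sketch of an inner hash join in Python. It is an assumption-laden illustration rather than qsv's code: rows are plain lists, keys are column positions, and the name `hash_inner_join` is invented for the example.

```python
from collections import defaultdict


def hash_inner_join(left, right, left_key, right_key):
    """Inner-join two lists of rows on the given column positions."""
    # Build the hash index once over the right-hand rows.
    index = defaultdict(list)
    for row in right:
        index[row[right_key]].append(row)
    # Probe the index for each left-hand row instead of rescanning `right`.
    joined = []
    for lrow in left:
        for rrow in index.get(lrow[left_key], []):
            joined.append(lrow + rrow)
    return joined


people = [["1", "ann"], ["2", "bob"], ["3", "cy"]]
cities = [["1", "nyc"], ["3", "sf"], ["3", "la"]]
result = hash_inner_join(people, cities, 0, 0)
```

Building the index costs one pass over one input, after which each lookup is a constant-time dictionary probe on average.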
Unlike join, it can process files larger than RAM, is multithreaded, has join key validation and pre-join filtering, supports asof joins, and its output doesn't have duplicate columns.
Available on the qsv CLI tool!
Convert newline-delimited JSON (JSONL/NDJSON) to CSV. See tojsonl command to convert CSV to JSONL.
Available on the qsv CLI tool!
qsv's Domain-Specific Language (DSL) for data-wrangling. Create multiple new computed columns, filter rows or compute aggregations by executing scripts.
Use a file dialog to choose an input file or output to a file with a save dialog.
Available on the qsv CLI tool!
Pseudonymise the values of the given column by replacing them with incremental identifiers.
Available on the qsv CLI tool!
Create a new computed column or filter rows by evaluating a Python expression on every row of a CSV. Python f-strings are useful for extended formatting, since they can also evaluate expressions.
Available on the qsv CLI tool!
Replace CSV data using a regex. Applies the regex to each field individually.
Reverse order of rows in a CSV. Unlike the sort --reverse command, it preserves the order of rows with the same key.
Modify headers of a CSV to only have 'safe' names - guaranteed 'database-ready'/'CKAN-ready' names.
Randomly draw rows (with optional seed) from a CSV using reservoir sampling (i.e., using memory proportional to the size of the sample).
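Reservoir sampling keeps only the sample in memory, however many rows stream past. The sketch below uses the classic "Algorithm R" variant in Python; whether qsv uses this exact variant is not stated here, and the function name `reservoir_sample` and its parameters are assumptions for illustration.

```python
import random


def reservoir_sample(rows, k, seed=None):
    """Draw up to k rows uniformly at random from an iterable of unknown length."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    reservoir = []
    for i, row in enumerate(rows):
        if i < k:
            # Fill the reservoir with the first k rows.
            reservoir.append(row)
        else:
            # Replace a random slot with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = row
    return reservoir


sample = reservoir_sample(iter(range(1000)), 10, seed=42)
```

Memory use is bounded by `k` because rows that are not selected are discarded as soon as they are read.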
Infer schema from CSV data, replete with data type & domain/range validation & output in JSON Schema format.
Run a regex over a CSV. Applies the regex to each field individually & shows only matching rows.
Run multiple regexes over a CSV in a single pass. Applies the regexes to each field individually & shows only matching rows.
Select, re-order, duplicate or drop columns.
Slice rows from any part of a CSV. When an index is present, this only has to parse the rows in the slice (instead of all rows leading up to the start of the slice).
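The benefit of an index for slicing can be shown with a simplified sketch: record the byte offset of every record once, then seek straight to the start of the slice. This is not qsv's index format. The helper names are hypothetical, and the sketch assumes one record per line with no newlines embedded in quoted fields.

```python
import io


def build_index(f):
    """Return the byte offset of each data record in a binary CSV file."""
    f.seek(0)
    f.readline()  # skip the header row
    offsets = []
    while True:
        pos = f.tell()
        line = f.readline()
        if not line:
            break
        offsets.append(pos)
    return offsets


def slice_rows(f, offsets, start, end):
    """Read records [start, end) by seeking directly to the first one."""
    if start >= len(offsets):
        return []
    f.seek(offsets[start])
    count = min(end, len(offsets)) - start
    return [f.readline().decode().rstrip("\r\n") for _ in range(count)]


data = io.BytesIO(b"id,name\n1,a\n2,b\n3,c\n4,d\n")
index = build_index(data)
middle = slice_rows(data, index, 1, 3)
```

Without the offsets, reaching record `start` would require parsing every record before it.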
Perform streaming compression/decompression of the input using Google's Snappy framing format.
Available on the qsv CLI tool!
Sniff & infer CSV metadata (delimiter, header row, preamble rows, quote character, flexible, is_utf8, & much more). Also a general mime type detector.
Sort data in alphabetical (optionally case-insensitive), numerical, reverse, unique or random (optional seed) order.
Check if a CSV is sorted. May also optionally retrieve record count, sort breaks & duplicate count.
Split one CSV file into many CSV files of N chunks. Uses multithreading to go faster if an index is present.
Available on the qsv CLI tool!
Show aligned output of a CSV using elastic tabstops. May pair well with csvlens.
Convert CSV files to PostgreSQL, SQLite, XLSX, Parquet and Data Package.
Available on the qsv CLI tool!
Convert CSV to a newline-delimited JSON (JSONL/NDJSON). Also infers JSON data types.
Transpose rows/columns of a CSV.
Validate CSV data using JSON Schema Validation, along with invalid records output and an error report.