Simple but powerful batch data generator for deterministic or randomized datasets.
- Lean Python DSL for rapid data synthesis.
- Batch-friendly CLI with stdin streaming and file-based workflows.
- Extensible via inline Python blocks and a rich prebuilt toolbox (math, random, rint, rstr, rfloat).
pip install mkdata
# Assuming you have uv installed, run:
uv tool install mkdata
- Clone the repository:
  git clone https://github.com/RayZh-hs/mkdata.git && cd mkdata
- Build a wheel:
  python -m build
- Install the freshest artifact:
  pip install "$(ls dist/mkdata-*-py3-none-any.whl | sort | tail -n 1)"
Run a generator script (conventionally *.gen):
mkdata path/to/script.gen
Read from stdin instead of a file:
mkdata -
You can find examples to run in the examples/ directory.
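For batch workflows driven from Python, the stdin form can be wrapped in a small helper. This is only a sketch and not part of mkdata: the helper name `run_via_stdin` is hypothetical, and it assumes the `mkdata` executable is on your PATH and writes the generated data to stdout.

```python
import subprocess
from typing import Sequence


def run_via_stdin(script: str, cmd: Sequence[str] = ("mkdata", "-")) -> str:
    """Pipe generator source to a command's stdin and return its stdout.

    The default command mirrors the `mkdata -` invocation shown above.
    """
    result = subprocess.run(
        list(cmd),
        input=script,         # sent to the child process on stdin
        capture_output=True,  # collect stdout and stderr
        text=True,            # work with str instead of bytes
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

With mkdata installed, `run_via_stdin(open("hello.gen").read())` should return the generated text as a string.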
Save this file as hello.gen:
@run {
n: rint(1, 100) \n # Random integer between 1 and 100, inclusive
@loop n { # Repeat n times
rint(1, 20) # By default entries concat with space
}
\n # End with newline
}
mkdata hello.gen
You should see the output on stdout, e.g.:
5
17 3 19 8 4
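For readers more familiar with plain Python, the script above can be read roughly as follows. This is an analogue for illustration, not how mkdata is implemented: it assumes `rint(a, b)` behaves like `random.randint(a, b)` (both bounds inclusive, as the comment in hello.gen states), and the function name `hello` is hypothetical.

```python
import random


def hello(rng: random.Random) -> str:
    """Plain-Python analogue of hello.gen: a count line, then that many integers."""
    n = rng.randint(1, 100)  # assumed equivalent of rint(1, 100), bounds inclusive
    values = [str(rng.randint(1, 20)) for _ in range(n)]
    # First line is n; second line holds the values joined by single spaces.
    return f"{n}\n{' '.join(values)}\n"


print(hello(random.Random(0)), end="")
```

The DSL version expresses the same thing without the explicit joins and string formatting, which is where most of the brevity comes from.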
A full set of examples is available in the examples/ directory.
(%)(variable:) expression (\suffix) (# comment)
- % hides output while still evaluating the expression.
- variable stores the evaluated result in the runtime environment.
- expression must be valid Python.
- \suffix customizes the string appended to the output (\n for newline, omit for empty).
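To make the four optional parts concrete, here is a simplified sketch that splits one entry line into them. It is not mkdata's parser: the names `Entry` and `split_entry` are hypothetical, and the sketch deliberately ignores edge cases such as a `#` inside a string literal.

```python
import re
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    hidden: bool             # True when the line starts with %
    variable: Optional[str]  # name before the colon, if any
    expression: str          # Python source to evaluate (may be empty)
    suffix: Optional[str]    # raw suffix token such as "\\n", if any


def split_entry(line: str) -> Entry:
    """Split one entry line into its optional parts (simplified)."""
    # Simplification: the first '#' starts the comment, so '#' inside
    # string literals is not supported here.
    body = line.split("#", 1)[0].strip()

    hidden = body.startswith("%")
    if hidden:
        body = body[1:].lstrip()

    # A trailing backslash token, alone or preceded by whitespace, is the suffix.
    suffix = None
    m = re.search(r"(?:^|\s)(\\\S*)$", body)
    if m:
        suffix = m.group(1)
        body = body[: m.start()].rstrip()

    # A leading identifier followed by ':' names the variable.
    variable = None
    m = re.match(r"([A-Za-z_]\w*)\s*:\s*(.*)$", body)
    if m:
        variable, body = m.group(1), m.group(2)

    return Entry(hidden, variable, body, suffix)


print(split_entry(r"n: rint(1, 100) \n # Random integer"))
```

Applied to the first line of hello.gen, this yields the variable `n`, the expression `rint(1, 100)`, and the raw suffix `\n`; a line consisting only of `\n` yields an empty expression with that suffix.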
Use @ to introduce a directive. Directives can be nested and combined freely.
You must use braces { ... } to denote the scope of a directive. Scopes should follow the Google convention for braces: the opening brace goes on the same line as the directive, and the closing brace goes on its own line unless another opening brace follows it immediately.
| Directive | Purpose |
|---|---|
| @run { ... } | Defines an execution scope and starts the interpreter loop. |
| @python { ... } | Executes embedded Python for custom helpers. |
| @redirect stdout\|stderr\|path | Redirects output to standard streams or files. |
| @loop count { ... } | Repeats a block count times. |
| @for target in iterable { ... } | Iterates with optional index tracking. |
| @any { ... } { ... } | Picks one block at random (weights optional). |
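The @any row describes a random choice among blocks, optionally weighted. Its semantics can be sketched in plain Python with `random.choices`. This is an illustration rather than mkdata's implementation: the helper `pick_block` and the way weights are passed are hypothetical, since the directive's weight syntax is not shown here.

```python
import random
from typing import Callable, Optional, Sequence


def pick_block(
    blocks: Sequence[Callable[[], str]],
    weights: Optional[Sequence[float]] = None,
    rng: Optional[random.Random] = None,
) -> str:
    """Run exactly one of the given blocks, chosen at random.

    With weights=None every block is equally likely.
    """
    if rng is None:
        rng = random.Random()
    # choices() returns a list of k picks; take the single selected block.
    chosen = rng.choices(blocks, weights=weights, k=1)[0]
    return chosen()


# Two "blocks" modelled as zero-argument functions; the first is three
# times as likely to be picked as the second.
print(pick_block([lambda: "small", lambda: "large"], weights=[3, 1]))
```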
- Create a virtual environment and activate it.
- Install locally in editable mode:
  pip install -e .
- Install tooling:
  pip install pytest
- Run the test suite:
  pytest
Formatters, linters, and additional tooling are welcome. See CONTRIBUTING.md for coordination tips.
- Contribution guide: CONTRIBUTING.md
- Release highlights: CHANGELOG.md
- License: LICENSE
This project uses the following open source packages:
- pytest for end-to-end testing.
The project's logo was designed by apien from Flaticon.
