Command-Line Interface (CLI)
Shekar includes a command-line interface (CLI) for quick text processing and visualization.
You can normalize Persian text or generate wordclouds directly from files or inline strings.
Usage
Commands
normalize
Normalize Persian text by standardizing spacing, characters, and diacritics.
Works with files or inline text.
Options
-i, --input     Path to an input text file
-o, --output    Path to save normalized text. If not provided, results are printed to stdout
-t, --text      Inline text instead of a file
--encoding      Force a specific input file encoding
--progress      Show progress bar (enabled by default)
Examples
# Normalize a text file and save output
shekar normalize -i ./corpus.txt -o ./normalized_corpus.txt

# Normalize inline text
shekar normalize --text "درود پرودگار بر ایران و ایرانی"
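When many files need normalizing, the command can be driven from a script. The following is a minimal sketch, assuming the `shekar` executable is on the PATH; the helper names `build_normalize_command` and `normalize_directory` and the directory layout are hypothetical, and only the `-i` and `-o` flags documented above are used.

```python
import subprocess
from pathlib import Path


def build_normalize_command(input_path, output_path=None):
    """Return the argument list for one `shekar normalize` call (hypothetical helper)."""
    cmd = ["shekar", "normalize", "-i", str(input_path)]
    if output_path is not None:
        # Without -o the CLI prints the normalized text to stdout.
        cmd += ["-o", str(output_path)]
    return cmd


def normalize_directory(src_dir, dst_dir):
    """Normalize every .txt file in src_dir and write the results into dst_dir."""
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    for path in sorted(Path(src_dir).glob("*.txt")):
        # check=True raises CalledProcessError if the command exits with an error.
        subprocess.run(build_normalize_command(path, dst / path.name), check=True)
```

Calling `normalize_directory("./corpus", "./normalized")` would then run one `shekar normalize` process per file. Building the command as a list, rather than a single shell string, avoids quoting problems with paths that contain spaces.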
wordcloud
Generate a wordcloud image (PNG) from Persian text, either from a file or inline.
Preprocessing automatically removes punctuation, diacritics, stopwords, and non-Persian characters, and normalizes spacing.
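To give a feel for what this kind of cleanup involves, here is an illustrative sketch in plain Python. It is not Shekar's actual implementation: the character ranges, the tiny stopword list, and the function name `preprocess` are all assumptions chosen for the example.

```python
import re

# Arabic-script diacritics (fathatan through sukun) plus superscript alef.
DIACRITICS = re.compile(r"[\u064B-\u0652\u0670]")

# Everything except Arabic-script letters used in Persian, ZWNJ, and whitespace.
# This also strips punctuation, digits, and Latin text. (Assumed ranges.)
NON_PERSIAN = re.compile(
    r"[^\u0621-\u063A\u0641-\u064A\u067E\u0686\u0698\u06A9\u06AF\u06CC\u200C\s]"
)

# A deliberately tiny, illustrative stopword list.
STOPWORDS = {"و", "بر", "در", "از", "به"}


def preprocess(text):
    """Strip diacritics and non-Persian characters, drop stopwords, collapse spacing."""
    text = DIACRITICS.sub("", text)    # remove marks without splitting words
    text = NON_PERSIAN.sub(" ", text)  # replace everything else with a space
    tokens = [tok for tok in text.split() if tok not in STOPWORDS]
    return " ".join(tokens)            # single spaces between the kept words
```

Diacritics are removed before the broader filter so that a marked word stays in one piece instead of being split at each mark.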
Options
-i, --input        Input text file
-t, --text         Inline text instead of a file
-o, --output       (required) Path to output PNG file
--bidi             Apply bidi reshaping for correct rendering of Persian text (default: False)
--mask             Shape mask (Iran, Heart, Bulb, Cat, Cloud, Head) or custom image path
--font             Font to use (sahel, parastoo, or custom TTF path)
--width            Image width in pixels (default: 1000)
--height           Image height in pixels (default: 500)
--bg-color         Background color (default: white)
--contour-color    Outline color (default: black)
--contour-width    Outline thickness (default: 3)
--color-map        Matplotlib colormap for words (default: Set2)
--min-font-size    Minimum font size (default: 5)
--max-font-size    Maximum font size (default: 220)
Examples
# Generate a wordcloud from a text file
shekar wordcloud -i ./corpus.txt -o ./word_cloud.png
# Generate a wordcloud from inline text with a custom mask
shekar wordcloud --text "درود پرودگار بر ایران و ایرانی" \
  -o ./word_cloud.png --mask Heart
Note: If the letters in the generated wordcloud appear separated, use the --bidi option to enable proper Persian text shaping.
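The wordcloud command can likewise be assembled from a script. This is a minimal sketch, assuming `shekar` is on the PATH and that `--bidi` is a bare on/off switch; the helper name `build_wordcloud_command` is hypothetical, and its rule of accepting exactly one input source is a choice made for this example rather than documented CLI behavior.

```python
def build_wordcloud_command(output, input_path=None, text=None, mask=None, bidi=False):
    """Return the argument list for one `shekar wordcloud` call (hypothetical helper)."""
    # This sketch insists on exactly one source: a file or inline text.
    if (input_path is None) == (text is None):
        raise ValueError("pass exactly one of input_path or text")

    cmd = ["shekar", "wordcloud"]
    if input_path is not None:
        cmd += ["-i", str(input_path)]
    else:
        cmd += ["--text", text]

    cmd += ["-o", str(output)]  # -o is required by the CLI
    if mask is not None:
        cmd += ["--mask", mask]  # built-in shape name or a custom image path
    if bidi:
        cmd.append("--bidi")  # assumed to be a switch that takes no value
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. For example, `build_wordcloud_command("./word_cloud.png", text="...", mask="Heart", bidi=True)` mirrors the inline-text example above with bidi shaping enabled.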