Resources for VocabSieve
VocabSieve supports a range of different local resources, which you can use without an internet connection.
Supported files
- StarDict, the most commonly available free dictionary format online
- Migaku (.json)
- JSON (.json, plain key-value pairs)
- Kaikki.org Wiktionary data dumps
- CSV
- DSL
- MDX
MDX dictionaries often rely on heavy HTML formatting for styling, but VocabSieve can only handle text-based definitions. Entries are converted to plain text before being shown, which may not always work well.
- JSON frequency lists (a simple list of words in JSON format)
- Sound libraries (a directory of audio files)
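As a rough illustration of the two JSON shapes listed above (plain key-value pairs for dictionaries, a simple list of words for frequency lists), the following Python sketch produces one of each. The function names and sample entries are hypothetical, and the sketch assumes that nothing beyond these bare structures is needed.

```python
import json


def dictionary_to_json(entries):
    """Serialize a plain key-value dictionary: each headword maps to its definition text."""
    # ensure_ascii=False keeps non-Latin headwords readable in the output file.
    return json.dumps(entries, ensure_ascii=False, indent=2)


def frequency_list_to_json(words):
    """Serialize a frequency list as a flat JSON array of words."""
    return json.dumps(list(words), ensure_ascii=False)


if __name__ == "__main__":
    # Made-up sample data, only to show the shape of each file.
    sample_entries = {
        "chat": "cat (domestic animal)",
        "chien": "dog",
    }
    print(dictionary_to_json(sample_entries))
    print(frequency_list_to_json(["the", "be", "and"]))
```

Saving either output under a .json file name gives a file in the corresponding shape.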
Dictionaries
Listed in the author's order of preference.
Kaikki Wiktionary dumps
High-quality parsed data of Wiktionary in various languages. Prefer these over the online Wiktionary API as they contain more information. The English and French Wiktionaries (note this is the language the entries are written in) contain a large number of entries in other languages. All the other versions also contain a lot of entries in their own language, which can be useful as monolingual dictionaries.
Hu Zheng (StarDict author) personal website, over 100 dictionaries
StarDict dictionaries converted by StarDict’s author from various formats. They are usually of decent quality and are plaintext, which makes them suitable for display in VocabSieve and Anki. StarDict archives need to be extracted before importing; then select the .ifo file in the extracted folder.
The website has been down for a while, but many of the files are archived on the Wayback Machine:
https://web.archive.org/web/20230717122310/https://download.huzheng.org/
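The extract-then-select step for StarDict files can be sketched in Python as below. This is only an illustration: it assumes the download is a tar archive (for example .tar.bz2), and the function name and paths are hypothetical. Any archive tool works just as well.

```python
import sys
import tarfile
from pathlib import Path


def extract_and_find_ifo(archive_path, dest_dir):
    """Extract a StarDict tar archive and return the .ifo files found inside it."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # Use the safer "data" extraction filter where this Python version supports it.
    extra = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    with tarfile.open(archive_path) as archive:  # compression is auto-detected
        archive.extractall(dest, **extra)
    # The .ifo file is the one to select when importing.
    return sorted(dest.rglob("*.ifo"))


if __name__ == "__main__":
    if len(sys.argv) == 3:
        for ifo_file in extract_and_find_ifo(sys.argv[1], sys.argv[2]):
            print(ifo_file)
    else:
        print("usage: python extract_stardict.py ARCHIVE DEST_DIR")
```

The printed path is the file to pick in the import dialog. Keep the other extracted files next to it, since the .ifo file only describes the dictionary.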
Lingvo DSL
Rutracker GoldenDict Dictionaries (Russian, English, Ukrainian)
https://rutracker.org/forum/viewtopic.php?t=3369767 (Page in Russian, click “скачать” to download torrent)
Website with Lingvo dictionaries (Website in Russian; some free resources, but most require payment)
Another website with Lingvo dictionaries (Website in Russian, mostly free resources)
A collection of dictionaries for GoldenDict, organized by language. Avoid the MDX format, as MDX dictionaries usually look worse than DSL ones.
https://cloud.freemdict.com/index.php/s/pgKcDcbSDTCzXCs
Apple Dictionaries, 41 dictionaries, some bilingual
https://cloud.freemdict.com/index.php/s/HsC7ybBWsbZ7B4N
Navigate to the “json” folder and download the items for your language. Note that the bilingual dictionaries listed include entries in both directions. For example, an English-Spanish dictionary contains both English words defined in Spanish and Spanish words defined in English. You do not need to extract the files in order to import them.
Migaku dictionaries
Migaku Official MEGA Folder, 11 languages
https://mega.nz/folder/eyYwyIgY#3q4XQ3BhdvkFg9KsPe5avw/folder/bz4ywa5A
Frequency lists
Lemmatized English frequency list
https://github.com/FreeLanguageTools/resources/raw/master/freq/freq_en.json.gz
Lemmatized Russian frequency list
https://github.com/FreeLanguageTools/resources/raw/master/freq/freq_ru.json.gz
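To look inside one of these files before importing it, a short Python sketch like the following can help. It assumes the download is a gzip-compressed JSON array of words, matching the frequency list format described under Supported files. The function name is hypothetical.

```python
import gzip
import json
import sys


def load_frequency_list(path):
    """Read a gzip-compressed JSON frequency list and return it as a Python list."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        words = json.load(handle)
    # Guard against files that are valid JSON but not a flat list of words.
    if not isinstance(words, list):
        raise ValueError("expected a JSON array of words")
    return words


if __name__ == "__main__":
    if len(sys.argv) == 2:
        words = load_frequency_list(sys.argv[1])
        print(len(words), "entries; first ten:", words[:10])
    else:
        print("usage: python peek_freq.py FREQ_FILE.json.gz")
```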
Cognate data
CogNet data processed for VocabSieve. It includes all languages and may take a while to import.
https://github.com/FreeLanguageTools/resources/raw/master/cognates.json.gz
Audio folders
These need to be extracted into a folder before importing. Select the containing folder for import. Do not delete the files afterwards, as they are not copied.
Lingua Libre sound libraries
https://lingualibre.org/datasets/
Forvo dump in various languages (may not be as complete as the online version)
https://cloud.freemdict.com/index.php/s/pgKcDcbSDTCzXCs?path=%2F0%20Forvo%20audio