Precision about lang: #24

Open
secou opened this issue Oct 25, 2020 · 2 comments

secou commented Oct 25, 2020

Just found in this official help page that Twitter only deals with 47 languages (look at the "lang:" section).

Owner

igorbrigadir commented Oct 25, 2020

Ah good point - I've been meaning to double check the languages again. That list is useful.

For my own reference later:

Current List (60):

lang:am Amharic (አማርኛ)
lang:ar Arabic (العربية)
lang:bg Bulgarian (Български)
lang:bn Bengali (বাংলা)
lang:bo Tibetan (བོད་སྐད)
lang:ca Catalan (Català)
lang:ch Cherokee (ᏣᎳᎩ)
lang:cs Czech (čeština)
lang:da Danish (Dansk)
lang:de German (Deutsch)
lang:dv Maldivian (ދިވެހި)
lang:el Greek (Ελληνικά)
lang:en English (English)
lang:es Spanish (Español)
lang:et Estonian (eesti)
lang:fa Persian (فارسی)
lang:fi Finnish (Suomi)
lang:fr French (Français)
lang:gu Gujarati (ગુજરાતી)
lang:hi Hindi (हिंदी)
lang:ht Haitian Creole (Kreyòl ayisyen)
lang:hu Hungarian (Magyar)
lang:hy Armenian (Հայերեն)
lang:in Indonesian (Bahasa Indonesia)
lang:is Icelandic (Íslenska)
lang:it Italian (Italiano)
lang:iu Inuktitut (ᐃᓄᒃᑎᑐᑦ)
lang:iw Hebrew (עברית)
lang:ja Japanese (日本語)
lang:ka Georgian (ქართული)
lang:km Khmer (ខ្មែរ)
lang:kn Kannada (ಕನ್ನಡ)
lang:ko Korean (한국어)
lang:lo Lao (ລາວ)
lang:lt Lithuanian (Lietuvių)
lang:lv Latvian (latviešu valoda)
lang:ml Malayalam (മലയാളം)
lang:my Myanmar (မြန်မာဘာသာ)
lang:ne Nepali (नेपाली)
lang:nl Dutch (Nederlands)
lang:no Norwegian (Norsk)
lang:or Oriya (ଓଡ଼ିଆ)
lang:pa Panjabi (ਪੰਜਾਬੀ)
lang:pl Polish (Polski)
lang:pt Portuguese (Português)
lang:ro Romanian (limba română)
lang:ru Russian (Русский)
lang:si Sinhala (සිංහල)
lang:sk Slovak (slovenčina)
lang:sl Slovene (slovenski jezik)
lang:sv Swedish (Svenska)
lang:ta Tamil (தமிழ்)
lang:te Telugu (తెలుగు)
lang:th Thai (ไทย)
lang:tl Tagalog (Tagalog)
lang:tr Turkish (Türkçe)
lang:uk Ukrainian (українська мова)
lang:ur Urdu (ﺍﺭﺩﻭ)
lang:vi Vietnamese (Tiếng Việt)
lang:zh Chinese (中文)

Twitter Advanced Search UI (42) https://twitter.com/search-advanced

"ar" Arabic 
"bn" Bangla 
"eu" Basque 
"bg" Bulgarian 
"ca" Catalan 
"hr" Croatian 
"cs" Czech 
"da" Danish 
"nl" Dutch 
"en" English 
"fi" Finnish 
"fr" French 
"de" German 
"el" Greek 
"gu" Gujarati 
"he" Hebrew 
"hi" Hindi 
"hu" Hungarian 
"id" Indonesian 
"it" Italian 
"ja" Japanese 
"kn" Kannada 
"ko" Korean 
"mr" Marathi 
"no" Norwegian 
"fa" Persian 
"pl" Polish 
"pt" Portuguese 
"ro" Romanian 
"ru" Russian 
"sr" Serbian 
"zh-cn" Simplified Chinese 
"sk" Slovak 
"es" Spanish 
"sv" Swedish 
"ta" Tamil 
"th" Thai 
"zh-tw" Traditional Chinese 
"tr" Turkish 
"uk" Ukrainian 
"ur" Urdu 
"vi" Vietnamese 

Tweetdeck language dropdown list (61): https://tweetdeck.twitter.com/ (Search -> "Written in")

"en" English
"am" Amharic (አማርኛ)
"ar" Arabic (العربية)
"hy" Armenian (Հայերեն)
"bn" Bengali (বাংলা)
"bg" Bulgarian (Български)
"ca" Catalan (Català)
"chr" Cherokee (ᏣᎳᎩ)
"zh" Chinese (中文)
"cs" Czech (čeština)
"da" Danish (Dansk)
"nl" Dutch (Nederlands)
"en" English (English)
"et" Estonian (eesti)
"fi" Finnish (Suomi)
"fr" French (Français)
"ka" Georgian (ქართული)
"de" German (Deutsch)
"el" Greek (Ελληνικά)
"gu" Gujarati (ગુજરાતી)
"ht" Haitian Creole (Kreyòl ayisyen)
"iw" Hebrew (עברית)
"hi" Hindi (हिंदी)
"hu" Hungarian (Magyar)
"is" Icelandic (Íslenska)
"in" Indonesian (Bahasa Indonesia)
"iu" Inuktitut (ᐃᓄᒃᑎᑐᑦ)
"it" Italian (Italiano)
"ja" Japanese (日本語)
"kn" Kannada (ಕನ್ನಡ)
"km" Khmer (ខ្មែរ)
"ko" Korean (한국어)
"lo" Lao (ລາວ)
"lv" Latvian (latviešu valoda)
"lt" Lithuanian (Lietuvių)
"ml" Malayalam (മലയാളം)
"dv" Maldivian (ދިވެހި)
"my" Myanmar (မြန်မာဘာသာ)
"ne" Nepali (नेपाली)
"no" Norwegian (Norsk)
"or" Oriya (ଓଡ଼ିଆ)
"pa" Panjabi (ਪੰਜਾਬੀ)
"fa" Persian (فارسی)
"pl" Polish (Polski)
"pt" Portuguese (Português)
"ro" Romanian (limba română)
"ru" Russian (Русский)
"si" Sinhala (සිංහල)
"sk" Slovak (slovenčina)
"sl" Slovene (slovenski jezik)
"es" Spanish (Español)
"sv" Swedish (Svenska)
"tl" Tagalog (Tagalog)
"ta" Tamil (தமிழ்)
"te" Telugu (తెలుగు)
"th" Thai (ไทย)
"bo" Tibetan (བོད་སྐད)
"tr" Turkish (Türkçe)
"uk" Ukrainian (українська мова)
"ur" Urdu (ﺍﺭﺩﻭ)
"vi" Vietnamese (Tiếng Việt)

Premium Search Docs Language List (70): https://developer.twitter.com/en/docs/twitter-api/v1/tweets/search/guides/premium-operators

Amharic: am
Arabic: ar
Armenian: hy
Basque: eu
Bengali: bn
Bosnian: bs
Bulgarian: bg
Burmese: my
Croatian: hr
Catalan: ca
Czech: cs
Danish: da
Dutch: nl
English: en
Estonian: et
Finnish: fi
French: fr
Georgian: ka
German: de
Greek: el
Gujarati: gu
Haitian Creole: ht
Hebrew: iw
Hindi: hi
Latinized Hindi: hi-Latn
Hungarian: hu
Icelandic: is
Indonesian: in
Italian: it
Japanese: ja
Kannada: kn
Khmer: km
Korean: ko
Lao: lo
Latvian: lv
Lithuanian: lt
Malayalam: ml
Maldivian: dv
Marathi: mr
Nepali: ne
Norwegian: no
Oriya: or
Panjabi: pa
Pashto: ps
Persian: fa
Polish: pl
Portuguese: pt
Romanian: ro
Russian: ru
Serbian: sr
Simplified Chinese: zh-CN
Sindhi: sd
Sinhala: si
Slovak: sk
Slovenian: sl
Sorani Kurdish: ckb
Spanish: es
Swedish: sv
Tagalog: tl
Tamil: ta
Telugu: te
Thai: th
Tibetan: bo
Traditional Chinese: zh-TW
Turkish: tr
Ukrainian: uk
Urdu: ur
Uyghur: ug
Vietnamese: vi
Welsh: cy

v2 API from Docs (70):

Amharic: am 
Arabic: ar 
Armenian: hy 
Basque: eu 
Bengali: bn 
Bosnian: bs 
Bulgarian: bg 
Burmese: my 
Catalan: ca 
Croatian: hr 
Czech: cs 
Danish: da 
Dutch: nl 
English: en 
Estonian: et 
Finnish: fi 
French: fr 
Georgian: ka 
German: de 
Greek: el 
Gujarati: gu 
Haitian Creole: ht 
Hebrew: iw 
Hindi: hi 
Hungarian: hu 
Icelandic: is 
Indonesian: in 
Italian: it 
Japanese: ja 
Kannada: kn 
Khmer: km 
Korean: ko 
Lao: lo 
Latinized Hindi: hi-Latn 
Latvian: lv 
Lithuanian: lt 
Malayalam: ml 
Maldivian: dv 
Marathi: mr 
Nepali: ne 
Norwegian: no 
Oriya: or 
Panjabi: pa 
Pashto: ps 
Persian: fa 
Polish: pl 
Portuguese: pt 
Romanian: ro 
Russian: ru 
Serbian: sr 
Simplified Chinese: zh-CN 
Sindhi: sd 
Sinhala: si 
Slovak: sk
Slovenian: sl
Sorani Kurdish: ckb
Spanish: es
Swedish: sv
Tagalog: tl
Tamil: ta
Telugu: te
Thai: th
Tibetan: bo
Traditional Chinese: zh-TW
Turkish: tr
Ukrainian: uk
Urdu: ur
Uyghur: ug
Vietnamese: vi
Welsh: cy

v2 API from Search Error Message (74):

am
ar
art-x-emoji
bg
bn
bo
ca
ckb
cs
cy
da
de
dv
el
en
es
et
eu
fa
fi
fr
gu
he
hi
hi-Latn
ht
hu
hy
id
is
it
ja
ka
km
kn
ko
lo
lt
lv
ml
mr
my
ne
nl
no
or
pa
pl
ps
pt
qam
qct
qht
qme
qst
ro
ru
sd
si
sl
sr
sv
ta
te
th
tl
tr
ug
uk
und
ur
vi
zh
zxx
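
Not every entry in that error-message list is a plain language code. As a rough sketch (the bucket names and the `classify` helper are made up here, not anything from Twitter), the codes can be grouped using two facts from ISO 639-2: `und` and `zxx` are the special codes for "undetermined" and "no linguistic content", and the range `qaa` to `qtz` is reserved for local use. What Twitter actually uses the `q**` codes for is not covered by this sketch.

```python
import re

# Excerpt of the codes from the error-message list above.
CODES = ["am", "art-x-emoji", "ckb", "hi-Latn", "qam", "qct",
         "qht", "qme", "qst", "und", "zh", "zxx"]

SPECIAL = {"und", "zxx"}                 # ISO 639-2: undetermined / no linguistic content
LOCAL_USE = re.compile(r"q[a-t][a-z]")   # ISO 639-2 local-use range qaa..qtz


def classify(code):
    """Return a rough bucket name for a lang code (bucket names are hypothetical)."""
    subtags = code.lower().split("-")
    primary = subtags[0]
    if primary in SPECIAL:
        return "special"
    # An "x" subtag introduces a private-use sequence in BCP 47.
    if LOCAL_USE.fullmatch(primary) or "x" in subtags[1:]:
        return "private-use"
    return "two-letter" if len(primary) == 2 else "three-letter"


# Group the excerpt by bucket, keeping the original order inside each bucket.
buckets = {}
for code in CODES:
    buckets.setdefault(classify(code), []).append(code)
```

Only the "two-letter" and "three-letter" buckets are candidates for a normal `lang:` entry in the list; the rest probably deserve a separate note.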

Standard Search v1.1 docs point to ISO 639-1 (https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes): https://developer.twitter.com/en/docs/twitter-api/v1/tweets/search/api-reference/get-search-tweets

v1.1 Filter stream: BCP 47 (https://tools.ietf.org/html/bcp47), per https://developer.twitter.com/en/docs/twitter-api/v1/tweets/filter-realtime/guides/basic-stream-parameters

v2: BCP47 https://developer.twitter.com/en/docs/twitter-api/tweets/search/api-reference/get-tweets-search-recent
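
The lists above spell some of the same languages differently: `iw` vs `he`, `in` vs `id`, and `zh-cn` vs `zh-CN`. A minimal sketch of a normalizer for comparing them, assuming only two things: `iw`, `in` and `ji` are the withdrawn ISO 639 codes for `he`, `id` and `yi`, and BCP 47 tags are case-insensitive with the usual convention of lowercase language, Titlecase 4-letter script and UPPERCASE 2-letter region. The `normalize` helper is hypothetical and meant for comparing lists only; which spelling a given Twitter endpoint accepts still has to be tested.

```python
# Withdrawn ISO 639 codes and their current replacements.
LEGACY = {"iw": "he", "in": "id", "ji": "yi"}


def normalize(tag):
    """Normalize a lang tag for comparison (case convention + legacy codes)."""
    parts = tag.split("-")
    lang = parts[0].lower()
    out = [LEGACY.get(lang, lang)]
    after_singleton = False
    for sub in parts[1:]:
        # Subtags after a one-letter singleton (e.g. "x") stay lowercase.
        if len(sub) == 1:
            after_singleton = True
        if not after_singleton and len(sub) == 4:
            out.append(sub.title())   # script, e.g. "Latn"
        elif not after_singleton and len(sub) == 2:
            out.append(sub.upper())   # region, e.g. "CN"
        else:
            out.append(sub.lower())
    return "-".join(out)


# Pairs taken from the lists above that should compare equal.
same = [("iw", "he"), ("in", "id"), ("zh-cn", "zh-CN")]
all_equal = all(normalize(a) == normalize(b) for a, b in same)
```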

All possible 2-letter lang codes assigned (184): https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes

Gonna merge / test all these to double check if they actually work.

igorbrigadir self-assigned this Oct 25, 2020
Author

secou commented Oct 25, 2020

Hey! Great job!
