Indo-European is the world’s largest language family by number of speakers, with close to half of the world’s population. Not only is it the dominant family of India, Europe, and much of the land in between (hence the name), but it has become dominant in the Western Hemisphere, and an important cultural presence in Sub-Saharan Africa and northern Eurasia, through European conquest and colonization; English, Spanish, Portuguese, French, and Russian are all Indo-European.

Indo-European is also the most studied language family; the relationships among its dialects are more certain, and future changes in thinking less likely, than for other large families. In fact, the study of Indo-European dialects led to the development of the central methods of linguistic comparison and proto-dialect reconstruction. Indo-European was identified through the similarities between Latin, Ancient Greek, and Sanskrit. All three of these classical dialects had extensive literature available to modern scholars, many of whom were familiar with all three and noticed the similarities, which are far greater among these ancient dialects than among related modern dialects that have undergone another two millennia of evolution. Because these dialects were spoken by literate societies, we also have an extensive record of their descendants over time, giving us a better idea of how they were pronounced. All three were written in scripts that recorded speech sounds; by contrast, Chinese, which also has an extensive ancient literature and historical continuity, was originally written in a script that was largely logographic, and thus preserved little information about its pronunciation.

The existence of Indo-European implies a proto-dialect, a single spoken dialect from which all Indo-European dialects descended. This proto-dialect was spoken somewhere in western Eurasia — the location is still in dispute — and gradually expanded, developing subdialects as the Indo-Europeans spread out beyond the reach of regular communication. These subdialects developed into the branches of Indo-European, groupings of descendant dialects, many of them constituting the modern spoken Indo-European dialects.

Indo-Iranian groups the Indic and Iranian branches, widely considered to be the most closely related branches.

Indic is the largest branch of Indo-European, comprising descendants of the pre-classical form of Sanskrit (संस्कृतम् « Săskŗtam »). Sanskrit was spoken by the Aryans, who entered the Indus Valley from the northwest around four thousand years ago; ‘Indo-Aryan’ is a synonym for ‘Indic’ and a collective term for Indic speakers. Nearly all of the dialects exist in a single dialect continuum (depicted at right, in blue) that covers the Indo-Gangetic Plain and extends into the highlands to the south. The local speech of Delhi (दिल्ली « Dillī »), known as Khari Boli (खड़ी बोली « K‛arī Bōlī » / كهڑى بولى « Kharī Boŭlī »), was adopted as a lingua franca by the Mughal rulers, becoming known as Urdu (اردو « Ɔurdū », “camp language”), which later gave rise to standard Hindi (हिन्दी « Hindī »); Hindi-Urdu can be considered a single dialect. The continuum comprises countless local dialects, which fall into the following major groups, west to east: Sindhi (سنڌى « Sind‛ī »); Punjabi (پنجابى « Panĝābī » / ਪੰਜਾਬੀ « Păjābī »); Gujarati (ગુજરાતી « Gujarātī »); four different “Hindi” groups, which can be distinguished roughly as Rajasthani (राजस्थानी « Rājast‛ānī »), Madhyadeshi (मध्यदेशी « Mad‛jadēśī », including Khari Boli), Uttardeshi (उत्तर्देशी « Uttardēśī »), and Bihari (बिहारी « Bihārī »); Pahari (पहाड़ी « Pahārī »), including Nepali (नेपाली « Nēpālī »); an eastern group, including Oriya (ଓଡ଼ିଆ « Oriā »), Bengali (বাংলা « Bā~lā »), and Assamese (অসমীয়া « Asamījā »); and a southern group, including Marathi (मराठी « Marāt‛ī ») and Konkani (कोंकणी « Kō~kanī »), disconnected from the Hindi parts of the continuum to the north, but connected to the east. Dialects from within the continuum were carried overseas by emigrants, primarily to other parts of the British Empire. The dialects outside of the continuum are insular, primarily Sinhalese (සිංහල « Sĩhala ») and Maldivian (ދިވެހި « Divehi »).

Indic vocabulary is widely used by non-Indic dialects whose speakers are predominantly Hindu (as among the Dravidians) or Buddhist (as in mainland Southeast Asia and Tibet), owing to the writing of Hindu scriptures in Sanskrit and many Buddhist scriptures in Pali (පාළි / पाळि « Pāli »), a now-extinct descendant of Sanskrit, probably from the Bihari group.

Iranian is spoken in the highlands between Anatolia and India. The main dialects, west to east, are Kurdish (Kurdî), Persian or Farsi (فارسى « Fārsī » / پارسى « Pārsī »), Baluchi (بلوچى « Balūčī »), and Pashto (پښتو « Patoŭ »). Farsi, essentially the same as the eastern Persian dialects of Dari (درى « Darī ») and Tajik (Тоҷикӣ « Tožikī » / تاجيكى « Tāĝīkī »), is widespread geographically as a vernacular owing to the Persian empire, and historically influential in other dialects. Ossete (Ирон Æвзаг « Iron Ævzag ») and the extinct dialect of the Zoroastrian scripture, the Avesta, are also Iranian dialects.

Germanic comprises two extant branches, West Germanic and North Germanic. West Germanic is far larger, owing especially to the presence of English. English derives from dialects spoken at the northern end of the West Germanic region, around the Jutland peninsula, transplanted to Britain and transformed under the influence of Norman French as well as North Germanic. English’s closest Germanic relatives are still spoken by small populations in the same original area, namely three varieties of Frisian, West (Frysk), East (Seelterfräisk), and North (Frasch / Friisk). The rest of West Germanic consists of a single dialect continuum (depicted at right, in blue). The continuum’s main standardized dialects are High German (Deutsch) and Dutch (Nederlands); others are Luxemburgish (Lëtzebuergesch, an official dialect of Luxemburg) and Low German (Nedderdüütsch). Two dialects have been transplanted from the continuum: the Dutch-related dialect Afrikaans and the High German-related dialect Yiddish (יידיש « Jīdīš »), spoken by the Ashkenazi Jews.

North Germanic can be divided into two main groups of dialects, Mainland Scandinavian and Insular Scandinavian; each group consists of more than one conventional “language”, but the dialects in each group are largely mutually intelligible. Mainland Scandinavian comprises two Norwegian dialects, standard Norwegian (Norsk / Bokmål) and New Norwegian (Nynorsk), as well as Danish (Dansk), Swedish (Svenska), and Scanian (Skaűnsk). Insular Scandinavian comprises Icelandic (Íslenzka) and Faroese (Føroysk).

Italic comprises the descendants of Latin (LATINA), collectively known as the Romance dialects, as well as a few dialects related to Latin that are no longer spoken. Latin was the local dialect of Latium (LATIVM), the area around Rome, spread throughout Europe by that city’s empire. Most of the Romance dialects exist in a single continuum across Western Europe, with the main standardized dialects being Portuguese-Galician (Português-Galego), Spanish (Español / Castellano), Catalan (Català), Occitan, French (Français), Northern Italian (Gall-Italich) and standard Italian (Italiano). In Iberia, the continuum is only operational in the north; dialects from three distinct points in that continuum were spread south with the Reconquista, the reconquest of Iberia from the Moors, so that Portuguese, Spanish, and Catalan (as spread by the expanding states of Portugal, Castile, and Aragon, respectively) are spoken in distinct bands along the southern portion of the peninsula. Disconnected from the continuum are the spatially-isolated Romanian (Româna / Româneşte), spoken throughout cultural România (which includes Moldova), and the insular Sardinian (Sardu). The continuum is not operational at all, of course, for those dialects that have been transplanted outside of Europe, namely the three main imperial Romance dialects, Spanish, Portuguese, and French, as well as the Spanish dialect of the Sephardic Jews, Ladino (לדינו « Ladīnoŭ »).

Latinate vocabulary has been heavily borrowed into European dialects, owing to Latin’s status as lingua franca in the western Roman Empire and its continued use in medieval Europe for religious and scholarly purposes. Even Romance dialects borrow from classical Latin, so that modern Romance dialects (and English, with its large French component) contain numerous vocabulary doublets, with an evolved term typically used for everyday concepts existing alongside a cognate reincorporated from classical Latin, typically used for technical, educated, or religious concepts.

Celtic was spoken at one point in central Europe, but was gradually pushed to the northwestern fringe, where its dialects are spoken by decreasing numbers. The Brythonic branch of Celtic consists of the largest remaining dialect, Welsh (Cymraeg), and Breton (Brezhoneg), and contained the extinct Cornish (Kernewek). The Gaelic (or Goidelic) branch comprises the largely mutually intelligible Irish Gaelic (Gaeilge) and Scots Gaelic (Gàidhlig / Albannach) and the extinct Manx (Gaelg). The long-extinct Gaulish was also Celtic, and so, probably, was Pictish.

Celtic and Italic are sometimes thought to have a closer relationship with each other than with other branches.

Balto-Slavic groups the Baltic and Slavic branches, generally accepted as being closely related.

Baltic is so named because its dialects are or were spoken along the coast at the eastern end of the Baltic Sea. The Baltic branch is thought to be more conservative than most branches, and therefore is closer to the reconstructed Indo-European. There are two main extant dialects, Latvian (Latviski) and Lithuanian (Lietuviškai). Old Prussian, spoken in Prussia before German supplanted it, is extinct.

Slavic: See Slavs.

Hellenic is simply Greek, Ancient (ΕΛΛΗΝΙΚΗ « ‛Ellēnikē ») and Modern (Ελληνικά « Ellēnika »). The vernacular form of Modern Greek is Demotic (Δημοτική « Dēmotikē »). Katharevusa (Καθαρεύουσα « Kat‛areuousa ») is a modern standard that uses classical Greek forms but modern (that is, Demotic) pronunciation, confined to educated communication but largely abandoned decades ago. As the lingua franca of the eastern Roman Empire, the dialect of the New Testament, and a major influence on Latin in classical times, Greek has been a large source of vocabulary in other dialects throughout Europe and Christendom.

Armenian (Հայերէն « Hajerēn ») is a branch by itself, with two standardized (but mutually intelligible) dialects, eastern and western. Eastern is based on the vernacular in the present Armenian state. Western Armenian is based on the vernacular used in the Ottoman Empire before the genocide.

Albanian (Shqip) is also a branch by itself, with two main modern dialects, the southern Tosk and the northern Gheg, with a transitional dialect between them. Though Tirana is in the Gheg region, the written standard throughout cultural Albania is based on Tosk.

Anatolian is the most distinctive branch of Indo-European, and thus presumably the first branch to break off. It was spoken in Anatolia in ancient times, and has no living descendants. The best-recorded Anatolian dialect was Hittite (𒉈𒅆𒇷 « Nešili »), official dialect and original lingua franca of the Hittite Empire, but the most widespread dialect was Luwian (« Luwili »), spoken in the late empire and in the various successor states around Anatolia and the Levant, and possibly in Troy.

Tocharian dialects, now extinct, were the easternmost of the Indo-European dialects, spoken along the Northern Silk Road, north of the Tarim Basin. They are known from texts dating to the sixth to eighth centuries. Texts in Tocharian A (« Ārśi ») are more confined geographically (to the east) and functionally (to religion) than texts in Tocharian B (« Kuśińńe »), which show evidence of vernacular use.



