Andrew Freeman's Perspectives on Arabic Diglossia

Dec. 9, 1996

No discussion of Arabic is complete without at least a cursory discussion of diglossia. Charles Ferguson is credited with first using the term diglossia in an article which he wrote in 1959 called Diglossia. He identified four languages, Arabic, Greek, Haitian Creole and Swiss German, as being prime examples of languages which fit into his definition of diglossia. Very simply stated, he said that diglossic speech communities have a High variety that is very prestigious and a Low variety with no official status, and that the two are in complementary distribution with each other; for instance, the High variety might be used for literary discourse and the Low variety for ordinary conversation. His original definition of diglossia was that the two varieties which are in a diglossic relationship with each other are closely related, and therefore diglossia is not bilingualism. In his defining examples he points out that the High variety is always an acquired form, and that some educated native speakers might even deny that they ever use the Low variety. An important component of diglossia is that the speakers have the personal perception that the High variety is the "real" language and that the Low variety is "incorrect" usage. In Arabic, people talk about the High variety as being "pure" Arabic and the dialects as being corrupt forms.

Since that time much has been written about diglossia. Charles Ferguson himself has commented on the weaknesses of his original article in Diglossia Revisited (1991, Southwest Journal of Linguistics). For the most part he re-endorses his original article, but he does criticize his lack of clarity in specifying that his definition of diglossia was putative, and that the point of his original article had been to point at a phenomenon that was not well understood, hoping that it would receive more attention. In the current context I feel the need to offer a coherent explanation and a proposed model for how diglossia works in Arabic.

The written language was first systematically codified in the 8th century CE. The Qur'an and the pre-Islamic poetry were the primary sources of the prescriptive standard for the written language, which has since that time been held in the highest regard by the entire Muslim community as the language of the Qur'an and the language that the angels in heaven speak. There is some evidence that diglossia existed at that time, since this codification of the language was motivated by a desire to have recent converts to Islam learn the correct language, rather than the "corrupted" urban varieties of Baghdad and Damascus. This standard language has not changed in terms of syntax and morphology since that time. There has been a gradual shift in the lexicon, so that I can't read the Sufi texts or histories from the 9th century without a special dictionary, but I can read witty entertainments or histories dating from about the 12th century on. A large body of literature has been written in this language. It is interesting to note that for a period of time Arabic was the language of scientific discourse, much like English is today. In another parallel to the current situation with English, a lot of what was written was not composed by native speakers. This codified language remains the highest standard of the language, basically unchanged to this day. Conversely, the spoken language has had no official status, and the various dialects have continued to evolve since the 8th century with no attempt to form a standard.

When the split between the dialects and Literary Arabic occurred is subject to debate. The prevailing view is that put forth by Ferguson in 1959 in an article entitled The Arabic Koiné, in which he posited that all of the dialects existing outside of the Arabian peninsula had as their common source a variety spoken in the military camps at the time of the Islamic expansion in the middle of the 7th century, and that this variety was already very distinct from the language of the Qur'an. In other words, the dialects are not corrupt forms, but instead have had a separate existence from the Classical language for as long as they have existed outside of the Arabian peninsula.

It should also be noted that even the codified Classical Arabic has a large amount of inherent variability. Most nouns have more than one allowable broken plural. There are many synonyms for even the most common vocabulary items, and some verbs have meanings that shift drastically according to the nouns and prepositions with which they are used. One explanation for this is that the Classical language that was used for the Qur'an was a special form of the language common to all of the tribes that existed in the Arabian peninsula at the time of the prophet that had previously only been used in the traditional poetry. The age of this form is not known, but it was a conglomeration that had existed for some time and was not representative of any single dialect or any one era.

From the 15th century on, most of the Arabic speaking world was under foreign domination, either Ottoman or European. The Ottomans produced all of their official documents in Turkish and their religious documents in Arabic. The French in Algeria, between 1830 and 1962, actively tried to suppress Arabic. The British in Egypt at one point tried to make the Egyptian dialect the official language. Literary Arabic stagnated during the Ottoman and colonial periods.

There are roughly four major dialect groups: a) Maghrebi (Morocco, Algeria, Tunisia and western Libya), b) Egyptian (eastern Libya, Egypt and the Sudan), c) Levantine (Syria, Lebanon, Jordan and Palestine) and d) the Arabic of the Arabian Peninsula and Persian Gulf (Iraq, Saudi Arabia, Yemen, Oman, Qatar, Bahrain, the UAE and Kuwait). These categories tend to ignore the split that has always existed throughout the history of Arabic between Bedouin, Rural and Urban varieties. For instance, the speech of a Cairene is closer to the speech of a Damascene than it is to the speech of a Bedouin dweller of Egypt, even though I have placed Damascus and Cairo into different dialect categories. There are also some dialect isolates and relic dialects in Central Asia and in the Sahara desert.

All of the dialects share features which do not exist in Classical Arabic. For Arabs they are mostly mutually intelligible, with the exception that the Maghrebi dialects are generally unintelligible outside of the Maghreb. For non-Arabs who have limited exposure to the dialects, the difference between dialects can be startling. Furthermore, most Arabs know how to speak in such a way that only people from their hometown can readily understand them. The lexical variation can be problematic. "Mara" in Palestinian means "wife", but in Egyptian dialect it means "loose woman". "Masha" in Palestine means "he walked", but in Morocco it means "he went". The word for "saucepan" is "qdra", "Hilla" and "Tunjara" in Rabat, Cairo and Hebron, respectively; however, all three have usable Modern Standard Arabic cognates. In Egypt and the Levant "maashi" means "all right", but in Yemen and Morocco it means "no".

The academic community in the US calls the modern form of Literary/Classical Arabic "Modern Standard Arabic" or MSA for short. An American who has only studied Modern Standard Arabic will be well received but will not understand much of the spoken discourse going on around him in an Arabic speaking country. For the most part Modern Standard Arabic is not used in spontaneous speech situations. In situations where a person has a prepared text in front of him/her, and keeps his/her remarks within the framework of the prepared text there is very little regional difference between what a reasonably educated speaker would produce as Modern Standard Arabic. As the remarks stray from the prepared text, so will the remarks also stray from Modern Standard Arabic. Some interviewers on TV and the radio are very skilled at staying in MSA for an entire interview. This form of the language is remarkably similar in all parts of the Arabic speaking world (including Dearborn, Michigan). Meanwhile the interviewee who does not have these MSA skills will start negating in dialect pretty early on, and by the end of a longer remark will probably be speaking almost entirely in dialect. King Hussein of Jordan can stay in MSA for an entire interview. Arafat doesn't even try, but he will read his speeches in pretty high fuS-Ha.

Since World War II the situation has been characterized by the end of overt colonialism. Since the end of colonialism the Arab governments have initiated mass education campaigns and have done almost nothing to stem the tide of mass rural migration to urban centers. The motivation has been that they wanted to turn their territories into modern industrialized nation-states. As a result of these social changes (disruptions?), I think we can safely say that the linguistic situation has been quite fluid during this time period.

In an attempt to show how the linguistic system of modern Arabic works, El-Said Muhammed Badawi of the American University in Cairo has offered us the diagram in Figure 1.

Badawi's Diagram "Levels of Egyptian Arabic"

The names of the five levels, from top to bottom, translated into English mean: the Classical Language of Tradition, the Modern Classical Language, the Colloquial of the Educated, the Colloquial of the Enlightened and the Colloquial of the Illiterate. It should be noticed that in this five level model every level includes mixing from all the other elements of the system. This is different from Ferguson's description of diglossia which states that the two forms are in complementary distribution. In this picture we can see that even the speech of the illiterate contains elements of the High variety (fuS-Ha). This picture gives a pretty good idea of the speech of an individual within the system as a function of the level and type of education that s/he has received.

Another popular model is two idealized poles, as in Figure 2.

MSA is at one end and dialect at the other, with a speech continuum stretching between them. This is essentially the Badawi diagram without specifying any of the details of the intermediate steps. The problem with both of these models is that they don't show the unity of fuS-Ha across the entire expanse of space and time at the High end of the spectrum, nor the atomization of the dialects by person and location at the other. Keith Walters at the University of Texas at Austin has done a lot of work with Tunisian Arabic (TA). He describes a situation of intrasentential diglossic switching between MSA and Tunisian Arabic where, because of lexical overlap and morphophonological reductions, it is often hard to separate the MSA from the TA. This situation has also been described by Heath in Codeswitching in Moroccan Arabic. Keith Walters in more than one article has presented us with the picture shown in Figure 3 as a workable model for Tunisian Arabic.

Keith Walters' diagram "System of ETA"

This diagram gives us a good picture of the complexity of the system in place in Tunis. It is also, I believe, a transportable enough model for the linguistic situation in just about any other nation in the Arabic speaking world, with the proviso that in the Gulf and in Egypt the codeswitching is between English and Arabic and not between French and Arabic. It deserves mention that in Tunis many speakers are truly bilingual in French, having learned it in the home, but this situation is extremely rare in Egypt or the Persian Gulf region. I also must say that I am pretty dubious about any purported MSA/CA used in extemporaneous speech. These situations are pretty much limited to university lectures, TV interviews and press conferences by heads of state, and even then the level of dialect use can be quite high.

Walters' diagram, like the Badawi picture, fails to show the fundamental unity of the High variety across the entire Arabic speaking world and across the entire history of the High variety. Unlike the Badawi picture, it fails to explicitly model the place of the individual within the system, i.e. not all participants can participate equally at all levels. The third weakness in this model is that it only shows the system as it exists in Tunisia. Keith Walters gives a companion diagram, shown in Figure 4, that shows the system for diglossic switching and French/Arabic codeswitching in Tunisian Arabic. It also shows the influence from other Arabic dialects on MSA and ETA, as well as the influence of TA on North African French.

All four of these models are very useful in trying to understand Arabic diglossia. My claim, which still needs to be substantiated, is that they all fall short of giving us the complete picture. Dr. Badawi's picture shows the elements that might be included in a single person's speech as a function of education, but it fails to model the variation that occurs as a function of location, i.e. that the caammiyya is not uniform as we move from location to location.

Walters' diagrams also fail to model the spectrum of dialects as a function of location, although he is very careful to label his diagrams as being specific to Tunisian Arabic. Additionally, these two diagrams do not attempt to show the individual's system in the same way that the Badawi diagram does. I will state again that an important fact to bear in mind is that MSA is nearly uniform throughout the Arabic speaking world and, since it is intimately linked with Classical Arabic, nearly uniform across a span of 1400 years. It is almost impossible to overstate the status of Classical Arabic in the culture of Arabic speakers. Many Arabs will state that Classical Arabic is "the real language" and that the dialects are "corrupted" or "impure" forms.

Allow me to propose a rope structure. Included in the system is everything that has ever been written that is still accessible to the average literate reader of MSA. Each such document belongs to the MSA grammar of the individual who can still access that document. The individual Arabic grammar and lexicon of everybody who claims to speak Arabic can be represented by a single strand. These strands come together on the MSA end of the scale and form a more or less cohesive rope-like structure. This rope separates into 22 less cohesive, smaller twine structures as we move down the rope into national dialectal varieties. As we move down the scale from more public varieties to confessional and neighborhood varieties, the fraying becomes more pronounced. The most frayed ends represent peer group "in-talk", local trade jargons and the like. Wherever a person's lexicon, syntax, morphology or phonology matches another person's, we can model that as the strands touching in that location. As per Badawi's model, an individual's access to the higher end of the rope is a function of education. Owing to the Palestinian diaspora, most Arabic speakers have a passive knowledge of the Palestinian dialect, and because of the Egyptian dominance of Arabic TV and cinema, most Arabic speakers have a passive knowledge of the Egyptian dialect as well. Presumably no strand will have any stretch along its length where it does not touch at least one other strand.

I see this model as combining the Badawi model with Walters' model. Each strand can be expanded into the Badawi picture, and Walters' model is a more complete blow-up of the system at the point where it breaks out into 22 separate systems. The rope shows the inherent point of contact between all the systems, which is the still viable MSA.

Andrew Freeman's "rope" diagram

In closing this section I want to point out that if we try to add in every national variety of Arabic to Walters' picture we end up with something very similar to the author's "rope" picture. It bears mentioning that the Egyptian dialect has some influence on the vernaculars of the entire Arabic speaking world. Conversely, the Maghrebi dialects of Morocco, Algeria and Tunisia are heavily stigmatized and do not exert much influence on any dialects outside of their region.

The question still remains how long this structure can remain in place before individual national strands fray off the rope completely and become languages in their own right, with their own literatures. The answer to this question still seems to be that it is not likely to happen soon, due to the high status that Classical Arabic has. The only variety of Arabic so far to break off and form its own language is Maltese, and most scholars agree that this happened because the Maltese are Christians and don't consider the Arabic language sacred in the same way that Muslims do.

Another phenomenon that receives attention is the so-called "Middle Arabic" or "Educated Formal Arabic", that is, a very classicized version of dialect or a very colloquialized version of MSA. The debate revolves around whether this is a stable form, a set of ad hoc accommodation strategies between educated speakers of mutually unintelligible dialects, or merely a series of unsuccessful attempts at speaking MSA. Is this the direction of the current language change in progress for Arabic? Are the dialects moving closer to each other and to MSA at the same time, while MSA continues to be simplified and to move in the direction of the dialects? These questions are very controversial. One thing that everyone agrees upon is that the national dialects are undergoing a leveling process and that there are a lot of recent borrowings from MSA into the dialects. European languages, especially English and French, still reign supreme in the realms of science and international commerce in the Arab world. No doubt there is plenty of opportunity for ambitious linguists to study code-switching and language contact phenomena for years to come.


