TL;DR: For implicit language acquisition, learner literature (graded readers written for language learners) is far more effective than “authentic” children’s literature. I compared samples from children's, adult, and learner literature as an objective, if not scientific, demonstration. It shouldn’t be a surprise that learner literature comes out ahead: the two are written for different purposes. Children’s literature is meant to expand the vocabulary of native speakers who already have huge vocabularies (thousands of word families), with the purpose of developing literacy, not general proficiency. Graded readers are written to take advantage of learners’ already-developed literacy, general knowledge, and first-language lexicon, which is larger still than a child’s.
The texts: The first 250 or so words of El patito feo; Tigres azules, Borges; Frida Kahlo, Kristy Placido
At the beginner and intermediate levels, *if the goal is implicit acquisition of vocabulary and grammar*, this comparison shows why oft-recommended traditional children’s literature isn’t the most effective choice. I always recommend graded readers (books written for learners at different proficiency levels) over children’s or YA literature for the reasons I’ll get into below, but I was curious as to just how much of a difference there was. So I compared them on variables that second language acquisition researchers and teachers widely consider important in evaluating a text’s usefulness for implicit acquisition through reading. It wasn’t at all a fair fight, so I also wanted to compare a canonical (adult) literary text. I give my reasons below for choosing these particular texts.
Criterion one: Frequency, measured using the Corpus del Español, a project of Professor Mark Davies. Screenshots of the results are in the link above. The three texts score similarly in terms of the highest-frequency vocabulary, with Borges taking a decent lead. But this also demonstrates why the importance of the highest-frequency vocabulary is often overstated. If you look at these words, you’ll notice that there’s not a lot you can do with them: not many verbs, adjectives, or adverbs. That said, by my count, Frida and Borges have more than twice as many verbs as El patito.
Criterion two: Usefulness/Versatility. Medium- and even some low-frequency vocabulary is often necessary for actually using the language in real life. If we look at the lowest-frequency words themselves, there’s a huge difference in ‘quality’ between the Frida vocabulary and that of the other two texts, in terms of versatility for an intermediate or even beginning learner: Frida’s is mostly commonly used verbs and adjectives, while the “authentic” texts (texts written for native speakers) include many relatively obscure nouns. If the Corpus allowed us to compare frequency within this category, and even within the medium-frequency words, I suspect we’d see a huge difference in the numbers between Frida and the other two texts.
Known words. Paul Nation and several other scholars have found that a text needs to consist of at least 90% known words for a reader to be able to *accurately* infer the meaning of unknown words from context, though 95-98% is usually considered ideal in these studies, and some scholars suggest that 80% is enough for high-aptitude learners. This is important because stopping to look up a word drastically cuts down on the amount of comprehensible input a reader gets, and quantity is one of the most important factors in implicit acquisition (most language acquired by highly proficient learners is acquired implicitly). Additionally, this research found that learners misinterpret new words from context more often than the researchers expected.
The problem so far is that high-frequency words and repetition on their own make for very boring texts. Cognates can alleviate this problem: by relying on the first language, unknown words become known words. Ironically, more complicated language often contains a disproportionately high number of cognates. In the low- and medium-frequency categories there are far more cognates in Borges than in El patito, which widens the comprehensibility gap even further. But the Frida text has even more, and nearly every medium- and low-frequency word is either relatively frequent, such as estudia, juegan, and cansada, or a cognate (usually both). On top of all that, unknown words in graded readers are usually defined in footnotes or a glossary. This is faster and more reliable than a dictionary, or even software that lets you highlight a word to look it up, such as LingQ (which I really like) or a Kindle.
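If you want to check a text against the known-word thresholds mentioned above, a rough sketch in Python might look like the following. The function names, the sample sentences, and the "known" word list are all made up for illustration. Note also that it counts exact word forms rather than word families, so it will underestimate coverage for anyone who recognizes a verb in more than one conjugation.

```python
import re


def tokenize(text):
    """Split text into lowercase word tokens (letters only, accents included)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def known_word_coverage(text, known_words):
    """Return the fraction of running words in `text` found in `known_words`."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    known = sum(1 for token in tokens if token in known_words)
    return known / len(tokens)


# Hypothetical sample and word list, not taken from any of the three texts.
sample = "Frida estudia mucho. Frida está cansada."
known = {"frida", "estudia", "mucho", "está"}

coverage = known_word_coverage(sample, known)
print(f"Known-word coverage: {coverage:.0%}")
```

Here five of the six running words are known, so coverage is about 83%, which is below the 90% floor. Adding cognates and glossed words to the known list is one way to model how a graded reader pushes that number up.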
Repetition: There isn’t too much of a difference in how repetitive the texts are, at least within such a small sample. The larger the sample, the more difference you would see. In 7200 total words, there are fewer than 150 ‘unique’ words in Frida. That’s a lot of repetition, but the text itself doesn’t feel too repetitive. But again, look at which words are being repeated and you’ll find that it’s more versatile vocabulary that’s repeated in Frida, while the repeated words in Borges and Patito are more useful to the story than to your average intermediate learner's communicative needs.
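To put a number on repetition, you can divide total words by unique words: 7200 total words over 150 unique words works out to each word appearing 48 times on average (more, since Frida has fewer than 150). Below is a minimal Python sketch of that calculation. The function name and the sample sentences are invented for illustration, and like any simple counter it treats each inflected form as its own "unique" word, which may not match how the publisher counts.

```python
import re
from collections import Counter


def repetition_stats(text):
    """Count total words, unique words, and average repeats per unique word."""
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    counts = Counter(tokens)
    total = len(tokens)
    unique = len(counts)
    mean_repeats = total / unique if unique else 0.0
    return {
        "total": total,
        "unique": unique,
        "mean_repeats": mean_repeats,
        "most_common": counts.most_common(3),
    }


# Hypothetical sample, not a quotation from the book.
sample = "Frida pinta. Frida estudia. Frida pinta mucho."
stats = repetition_stats(sample)
print(stats["total"], stats["unique"], stats["mean_repeats"])

# The whole-book figure from the paragraph above: 7200 words, 150 unique words.
print(7200 / 150)
```

The `most_common` list is the part worth reading by eye, since it shows which words are being repeated and not just how often.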
*Frida Kahlo* is considered a level 2 reader. Level 1 readers will have fewer unique words, while level 3 will have many more and include the subjunctive and all tenses. Graded readers can be found at fluencymatters.com, or on Amazon and the personal pages of authors such as Bryan Kandel, Andrew Snider, Adriana Ramirez, and Bill VanPatten. The FM readers have audiobook options, as well as entire courses built around each book, but some of them will feel a bit too juvenile to adults. Their most recent books come in two versions: one in the present tense, one in the past. VanPatten’s books should be read to bridge the gap when you feel you’re ready for “authentic” texts. VanPatten is one of the most renowned experts in Second Language Acquisition research and the most involved in influencing actual language teaching. He retired from his professorship to write fiction. Most of the other authors are teachers.
I chose Borges because his writing has the reputation of being difficult, though possibly more for his ideas than the language itself. The specific story, *Tigres azules*, was a random choice. I compared it with the first page of *El aleph* and the results were similar. I chose *El patito feo* because it’s often the example people use when advocating for children’s literature. More modern children’s literature would fare better, but it’s more difficult to find for free or at low cost. New children’s books in Spanish also get expensive unless you can find them in the library, where the choice is usually limited.