When can we say a child has acquired their L1?

This question (or something like it) was posted on Twitter some time in May and it led to a bit of back and forth, with not a lot of consensus on the right answer. I contributed a few tweets, but because I was travelling at the time it was hard to engage properly in the ongoing conversation. Moreover, Twitter isn’t a great medium, imo, for the kind of discussion the question deserves. I said I might write something about this when I was back home, and it’s taken me a bit of time to settle in and actually write up my thoughts, but here they are. Fair warning, I have a lot to get done in my last week or so of sabbatical, so I wrote this quickly and hence it isn’t very polished. It also isn’t as nuanced as it could be, but for me, for now, it’ll do. (A new attitude I’m trying out. For those who know me, not a natural way of being for me.)

 

It’s a deceptively simple question, and it’s not uncommon for textbooks to say that the average child has acquired most of the basics by 3 or 4 years of age. Not surprisingly, then, the OP got a few responses along the lines of “I’ve heard 3 or 4,” as well as ones that went on to say things like “but I think it’s more like 5 or 6.” Then there were a whole lot of other responses that were deemed (by someone else, not the OP) to be unhelpful. I assume that the unhelpful comment was about responses that basically said “never” or, even worse, “the question doesn’t make sense”. I am in the camp of the latter response, and here’s a bit of an explanation as to why. I’ll break down a few of the reasons I think the question is troublesome.

I should be clear that I don’t mean to pick on the OP: they asked what to many people seems like a perfectly reasonable question, and any points they (or others) brought up in response to answers were likely quick responses in a medium that is not good for long, nuanced, reasonable exchanges. So I use them as ways to talk about why the question is problematic, not why the person is wrong. Because of this, while I use things that were said in my response, I won’t ID any of the Twitter handles.

 

1) What do you mean by ‘their language’? To answer a question about when something happens you need to know what the something is. So what does ‘their language’ mean?

In an effort to be more specific (which is good) in response to the question of what they meant by language, the OP distinguished between ‘their idiolect’ and ‘their language’. But since any language is really just a label we give to a collection of idiolects, I am assuming they meant shared vs. non-shared things. But even that is not as unproblematic as one would like, especially given we are talking about things that have been learned. Whether one thinks that some things about language are innate or not, the things that are presumed, by some, to be innate are not the things the OP was wondering about. And there is emerging evidence that there is more variation in language knowledge in adults than we might have assumed, meaning that we would also have to define what we mean by ‘shared’ (i.e. the stuff that is part of their ‘language’). If it’s 100% agreement among adult speakers of the language, then there might be less left than we think (Dabrowska, 2018), as anyone who’s ever sat in on an undergrad (or grad for that matter) syntax class can attest. I’ve never heard a native English speaker say that “ran the dog after cat the” is grammatical, but you get much beyond basic word order and you inevitably have at least a few dissenters here and there. That’s not to say that there are not high levels of agreement on many things, but a high level of agreement is not total agreement. We could of course include things without perfect agreement, but that would require some specific pre-defined level of agreement (given a specific sample size) to include things where there is less than perfect agreement. Moreover, we often appeal to idiolects to explain (within-dialect) variation. When you say ‘in my idiolect it works this way’ it means that the version of the language you have acquired works that way, that is, you are including the idiolectal in the language that has been acquired.

This may seem like an absurd thing to bring up, but we need to define what we are asking about, and even that basic step is more fraught than it would at first appear. I mean, we all know what we mean when we say the child is learning English, or French, or ISL, right? Yes, of course we do, but saying the child is learning English is different than making a claim about when she has actually ‘acquired’ English. That entails a target that is reached, and my point is that it’s not totally clear what that target includes, so how can we decide that it has been reached? It’s kind of like asking someone how long it will take to drive to Toronto from somewhere. Do you mean to the outskirts of the region people refer to as Toronto, the actual boundary of the city proper, or to downtown? Those are three different places and they take three different times to reach.

 

2) Even if we could distinguish between their ‘language’ and their idiolect, there are still a lot of ways we could define ‘language’ and this was pointed out. Setting aside for now what we mean by ‘acquired’ (more on that later) do you care about the phonological system, the syntax, the semantics, the pragmatics, sociolinguistic variation? All of it? Many people would likely leave out the lexicon, as we know that we continue to learn new words throughout our lifetime and it feels odd to say that someone who doesn’t know every word in a language (i.e., every word used by some speaker) hasn’t yet acquired the language. (But even then, would we want to say someone who only knows 500 words knows the language? I don’t know the answer to this question, but the point is that these sorts of questions are tricky.) But if we care about the learning of culturally shared things, then things like sociolinguistic variation would seem to be especially relevant, as no one, regardless of their theoretical orientation, would suggest that these things aren’t learned. Other things fit less clearly (e.g. affix ordering). Some people think everything is learned, others think some things aren’t. So your theory can define what you include in the set of things that are learned and so is relevant for answering the question. Here I am not taking a stand on what is versus what isn’t learned; my views on that are not relevant to the point I am making. What is relevant is that what you include in “their language” is not theory neutral, if what you want is to include only things that are clearly learned.

 

3) Then there is the question of what ‘acquired’ means. There was some mention of an inflection point, where, e.g., learning stops being fast and starts being slow, in order to accommodate the known ongoing changes in adult language (e.g., changes in RC processing, Wells, Christiansen, Race, Acheson, & MacDonald, 2009), the idea being that the inflection point would be the time of interest. In principle, the idea of an inflection point is reasonable (at least to me), but since each form in the language likely has its own, deciding on how to create a super measure that would include them all (even within a specific domain of interest) would be arbitrary. (The extant data suggest that each individual aspect of language has its own trajectory, where by individual aspect I don’t mean things like ‘passives’ or ‘relative clauses’ or ‘tense’, I mean things at a more fine-grained level than that: e.g., verb agreement appears to be acquired on a morpheme by morpheme basis (possibly even a verb+agreement morpheme by verb+agreement morpheme basis). So you’d need an inflection point for each individual ‘thing’.) I am not saying that we couldn’t create a measure of average inflection points (across aspects of the language), just that, however we do it, it will be arbitrary. Moreover, I would actually be surprised if every child showed the same ‘super-line’, that is, if an averaged inflection point looked the same for all children (it certainly doesn’t for word learning, e.g., see work on the naming insight/naming explosion). And for children in different cultures where there are different practices surrounding talk to children, timing may be affected (Shneidman & Goldin-Meadow, 2012). (It’s easy enough to have different ages at which the language is considered acquired for different cultural and linguistic groups, but that has the potential to bring with it a lot of unnecessary baggage.) Additionally, what are we measuring? Production? Comprehension? Generalization? Correct generalization into all possible contexts? All of these?

 

Basically, while it’s possible, in principle, to figure out exactly what you mean by acquired, for what, and how you are going to measure it, it’s not at all clear to me that it’s in any way interesting, or more importantly, meaningful, to do so, because the time point that comes out as a result will be an arbitrary one that is only meaningful within the specific definitions. For these, and other reasons then, I don’t think the question makes sense.

Now back to my sabbatical to-do list in a vain attempt to get one or two things I had planned to do actually crossed off before it ends.

 

Dabrowska, E. (2018). Experience, aptitude and individual differences in native language ultimate attainment. Cognition, 178, 222–235.

Shneidman, L. A., & Goldin-Meadow, S. (2012). Language input and acquisition in a Mayan village: How important is directed speech? Developmental Science, 15(5), 659–673. doi:10.1111/j.1467-7687.2012.01168.x

Wells, J. B., Christiansen, M. H., Race, D. S., Acheson, D. J., & MacDonald, M. C. (2009). Experience and sentence processing: Statistical learning and relative clause comprehension. Cognitive Psychology, 58(2), 250–271. doi:10.1016/j.cogpsych.2008.08.002

Methods sections and (avoidance of) self-plagiarism – some thoughts and a practical solution – Guest post by Dr. Matt Dye

The following is written by Dr. Matt Dye of the National Technical Institute for the Deaf in Rochester, NY. It’s a follow-up to my related post. You can find more information on him and his research here: http://www.deafxlab.com/

Ah, the Methods section! Perhaps the driest, yet most important, section in a research article. That section undergraduates always decide to skip, and of which reviewers always ask for more clarification.

I’d start by dialing back the sarcasm, and reiterating that this is perhaps the most important section in the article. Along with the Results section, it allows the educated reader to discern the quality of the science being reported. So, let’s not skimp. However, we have all felt the pain of trying to say the exact same thing using different words. Here we have the undergraduate refrain, “But the authors said it so eloquently, I couldn’t find a way to paraphrase it without making it worse!” However, we are, for the most part, not undergraduates submitting our work to academic journals for peer review. So, we cannot get out of it that way.

I like to think that after me doing my best technical writing, and one (or two) rounds of responding to peer review, my Methods section is as tight as it can get. Scientific protocol rendered into perfect prose. But if I am honest with myself, then of course there is room for improvement. Herein follows a suggestion that could (a) improve the Methods section, (b) result in a rewritten Methods that hopefully avoids charges of text recycling, and (c) provide a valuable educational experience for our postdocs and students:

1.  Ask a trainee in your lab to replicate the setup of your study. From scratch. Using only your Methods section as a guide. Here lies a critical test of how well that section is written.
2.  Assess how well the trainee was able to do so. Could she accurately reproduce the same procedure, or did she have to request information not in the manuscript? Were there any differences between her setup and the one you expected?
3.  If there were errors, or required information was missing, ask the trainee to rewrite the Methods section to provide the necessary information. Assign authorship credit and acknowledge contribution to the new manuscript.
4.  If there were no errors, ask the trainee where she was uncertain or where she had to struggle to figure out what to do. Ask the trainee to rewrite the Methods section to make clearer the necessary information. Assign authorship credit and acknowledge contribution to the new manuscript.
5.  Repeat process for each manuscript using the same (or very similar) methods. As soon as you have reached the point of perfection (or massively diminished returns on time invested):
*   Cite the latest iteration in new submissions;
*   Make a preprint of the article with that version publicly available on your website (and make sure that it can be downloaded anonymously);
*   In your cover letter, let the Editor know about this process (or a better version, which I’m sure is possible).

We end up with better trained students and postdocs (who also get appropriate author credit for their CVs), improved replicability of methods, and less chance of desk rejection from hard-working and under-appreciated editors.

Matt Dye PhD FPsyS
RIT/NTID
http://www.deafxlab.com/

Methods sections and (avoidance of) self-plagiarism

A while back I posted the following on twitter: “I hate writing methods sections for work that is the same as previous work: tweaking wording that works to avoid self-plagiarism is tedious.”  My twitter auto posts to Facebook, so my friends there saw it too. Interestingly, on FB I mostly got commiseration from others who similarly dislike having to do this (but who mostly also seem to do it). On twitter, however, the responses were mostly advice to ‘cite and copy’, and there were several people referring to COPE guidelines as justification.

The issues surrounding self-plagiarism, or text recycling as this practice is sometimes called, are too complex, I think, for a series of response tweets. So I decided to write a blog post about it, introducing the issue from my perspective. I also invited some friends and fellow researchers who commented on FB to contribute their perspective. They will be added later and the post updated. (Here’s a link to one of them.)

From me:

I do behavioural work. Many of my studies are trying to understand aspects of language learning. How do we learn languages? What is easy or hard to learn? For whom? And why? Studying real people learning real languages can lead to hypotheses, but it is hard to definitively answer these questions using real life learners learning real life languages in real life. (Yes, I know that that is terribly repetitive. But it gets my point across.) So researchers in my field have to do something different to really get at the questions we are interested in. The paper in question discusses a study using a miniature artificial language methodology. Let me give you some background on this methodology. Hint: when I say methods I don’t mean (just) statistical methods or analyses, I mean the whole design of the study from start to finish. In my field, a lot of the ‘heavy-lifting’ is done in the design, meaning, the stimuli and test items. The nature of the data collection process is crucial and can be quite complex.

Although the general method has a pretty standard abbreviation (MAL, which I will use from here on in), there is nothing whatsoever that is standard about MAL methods. Each MAL is constructed to get at a specific question. Basically the process of MAL development goes something like this: the researcher thinks about the specific variables they are interested in isolating in a language/learning situation/learner, and then designs a language or set of languages (each given to a different condition) that varies on that single variable. Michael Erard (he writes a lot of great stuff about language) did a piece on MALs a few years back that explains the process and intent behind it well. https://motherboard.vice.com/en_us/article/sillyspeak-the-art-of-making-a-fake-language

In any case, each language is unique and the specifics of the language need to be described in enough detail so a reader can evaluate whether it actually gets at the question it was supposedly designed to get at. In my work, I use a variety of different kinds of MALs to get at different kinds of questions. Sometimes the ‘language’ is just sounds. These are used, for instance, when researchers are interested in the kinds of statistical computations learners can perform and whether those computations can help you discover the kinds of patterns that exist in real languages. This line of work got its start with Saffran, Aslin, & Newport (1996) and their basic method has been used in a great deal of follow-up work (including some out of my lab…). People are presented with a sample of MAL input for some (usually, but not always, prespecified) amount of time and then are later tested on what they know. Testing usually involves judging items that are or are not consistent with the patterns in the input language. It might seem that this specific MAL is well known enough that methodological details beyond question- or theory-driven adjustments can be dealt with by simply citing the SAN paper. But it turns out that some seemingly irrelevant methodological differences might be important to learning outcomes (plug for research by my student). Meaning that at this point we shouldn’t simply leave out methodological details from these kinds of MAL studies.

Most of my MAL work (I do other things too) investigates very different questions and uses much more complex artificial languages; the words mean something, they are presented in sentences alongside video clips, and participants are asked to produce novel sentences (i.e., sentences they didn’t get in their exposure). They are also sometimes asked to make judgments about novel sentences that are or are not consistent with the patterns in their input. The specifics of the language design are important, as are the specifics of the judgment task test items that are inconsistent with the patterns in the input. That is, the ‘ungrammatical’ MAL sentences can tell us different things depending on why or how they are ungrammatical. The specifics of the design are very important in these studies: If the language or the test items are not designed properly, the study won’t test what it is supposed to test. Thus, a thorough description of the methods is very important for readers (and reviewers!) to be able to assess the results and conclusions based on them in any MAL research, let alone replicate them.

The MALs used by SAN and related work are simple enough that it takes relatively little space to describe them well. However, the more complex languages I use in most of my work take a great deal more. Thus, the method sections in these papers are long if they (the methods) are well described. I (and others) tend to use base languages that I tweak as necessary to ask related questions. That means that there are multiple papers using very similar methods. It might seem then that I could simply refer back to the earliest paper for the basics of the methods and just explain any differences or deviations from the original in the new paper. But then the reader, or reviewer, could not actually assess the later papers on the basis of what is actually in that paper. As a reviewer, I hate it when I cannot assess a paper on the basis of what is in the paper. Don’t make me go look somewhere else to figure out whether what you did makes sense. So I am left with essentially repeating a great deal of content from one paper to the next. (Before you accuse me of salami-slicing, I don’t. These are papers asking related but different questions about a particular phenomenon, and so using very similar methods makes sense.) What to do?

Many of the tweets I received in response to my original tweet were telling me to go ahead and copy, being sure to cite the original, per COPE’s guidelines.

Let’s look at those guidelines (which the journal I am planning on submitting the paper in question to is a member of).

I downloaded a copy from the following website https://publicationethics.org/files/Web_A29298_COPE_Text_Recycling.pdf on June 13, 2017. I will inset any quotations from those guidelines to make clear which text is not mine in what follows.

These guidelines are intended to guide editors when dealing with cases of text recycling.

Text recycling, also known as self-plagiarism, occurs when sections of the same text appear (usually un-attributed) in more than one of an author’s own publications. The term ‘text recycling’ has been chosen to differentiate from ‘true’ plagiarism (i.e. when another author’s words or ideas have been used, usually without attribution).

A separate issue, not to be confused with text recycling, is redundant (duplicate) publication. Redundant (duplicate) publication generally denotes a larger problem of repeated publication of data or ideas, often with at least one author in common. This is outside the scope of these guidelines and is covered elsewhere.

Notice that it says “usually un-attributed”, suggesting that simply citing the appropriate original source does not necessarily make it not text-recycling. Moving on…

How can editors deal with text recycling?

Editors should consider each case of text recycling on an individual basis as the ‘significance’ of the overlap, and therefore the most appropriate course of action, will depend on a number of factors.

Significance isn’t defined, and the factors that are discussed don’t really make significance any clearer (to me). Shortly thereafter it says this:

In general terms, editors should consider how much text is recycled. The reuse of a few sentences is clearly different to the verbatim reuse of several paragraphs of text, although large amounts of text recycled in the methods might be more acceptable than a similar amount recycled in the discussion.

In my work, it is more than a few sentences, and even ‘several paragraphs’ is pushing it. Clearly, reuse in methods sections is seen as being different, but even there, editors are being counseled to attend to the amount of repeated text. But what exactly counts as ‘large amounts’ that ‘might be more acceptable’? And notice that it doesn’t say ‘acceptable’, it says ‘more acceptable’. More acceptable can still be unacceptable. So far, clear as mud. The guidelines highlight the editors’ discretion, which means that they can be applied differently by different editors. And can result in serious consequences for authors.
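
To make the ‘how much text is recycled’ problem a bit more concrete, here is a minimal sketch of one way overlap between two methods sections could be counted: the share of five-word sequences in the new text that also appear in the old one. This is only an illustration under stated assumptions. It is not a description of how CrossCheck or any editor actually measures overlap, and the function name, the five-word window, and the two sample sentences are all hypothetical.

```python
import re


def shared_ngram_fraction(new_text: str, old_text: str, n: int = 5) -> float:
    """Fraction of word n-grams in new_text that also occur in old_text.

    Hypothetical helper for illustration only; real similarity checkers
    may work quite differently.
    """
    def ngrams(text):
        # Lowercase and keep only word-like tokens so punctuation is ignored.
        words = re.findall(r"[a-z0-9']+", text.lower())
        return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

    new_grams = ngrams(new_text)
    if not new_grams:
        # Text shorter than n words: nothing to compare.
        return 0.0
    old_grams = set(ngrams(old_text))
    shared = sum(1 for gram in new_grams if gram in old_grams)
    return shared / len(new_grams)


# Made-up example sentences, not taken from any real paper.
old_methods = "Participants heard the language for ten minutes and then completed a judgment task."
new_methods = "Participants heard the language for ten minutes and then produced novel sentences."

print(shared_ngram_fraction(new_methods, old_methods, n=5))
```

Even in a toy like this, the resulting number depends on choices such as the window size and how words are tokenized, so a cut-off for ‘large amounts’ would still have to be picked by whoever is doing the checking.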

Text recycling may be discovered in a submitted manuscript by editors or reviewers, or by the use of plagiarism detection software (e.g. CrossCheck). If overlap is considered minor, action may not be necessary or the authors may be asked to re-write overlapping sections and cite their previous article(s) if they have not done so.

More significant overlap may result in rejection of the manuscript. Where the overlap includes data, editors should handle cases according to the COPE flowchart for dealing with suspected redundant publication in a submitted manuscript. Editors should ensure that they clearly communicate the reason for rejection to the authors.

This says authors may be asked to rewrite and cite (if they haven’t already), again suggesting that just having cited yourself is not enough: it shouldn’t be the same text (i.e., it should have been rewritten).

 

And from the guidelines published on the web by the journal’s publisher (Taylor & Francis); copied text is again inset and is from the following website: http://authorservices.taylorandfrancis.com/ethics-for-authors/ (text copied below retrieved June 13, 2017):

Case 2: Plagiarism

“When somebody presents the work of others (data, words or theories) as if they were his/her own and without proper acknowledgment.” Committee of Publications Ethics (COPE)

When citing others’ (or your own) previous work, please ensure you have:

  • Clearly marked quoted verbatim text from another source with quotation marks.

According to this, it might be fine if I just enclosed the pages (yes pages) in question inside quotation marks. But pages and pages of quotations (even from my own work) seems excessive.

Shortly after that section is the following one (same website, same date of retrieval, copied text is again inset to make clear it is copied and not mine):

Make sure you avoid self-plagiarism

Self-plagiarism is the redundant reuse of your own work, usually without proper citation. It creates repetition in the academic literature and can skew meta-analyses if the same sets of data are published multiple times as “new” data. If you’re discussing your own previous work, make sure you cite it.

Taylor & Francis uses CrossCheck to screen for unoriginal material. Authors submitting to a Taylor & Francis journal should be aware that their paper may be submitted to CrossCheck at any point during the peer-review or production process.

Any allegations of plagiarism or self-plagiarism made to a journal will be investigated by the editor of the journal and Taylor & Francis. If the allegations appear to be founded, all named authors of the paper will be contacted and an explanation of the overlapping material will be requested. Journal Editorial Board members may be contacted to assist in further evaluation of the paper and allegations. If the explanation is not satisfactory, the submission will be rejected, and no future submissions may be accepted (at our discretion).

Note that the first sentence says ‘usually without proper citation’ not ‘without proper citation’. That means that even including a citation does not by itself clear you of self-plagiarism. It also does not distinguish methods sections from other sections of the paper. (As a language researcher I tend to notice these wording choices as well as words that are missing. Unless I’m editing my own work, in which case I am quite likely to miss missing words, make bad wording choices, etc.)

 

I found a paper in Biochemia Medica discussing this issue with a bit more clarity. (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3900061/) The paper attempts to lay out potential editorial policies regarding different kinds of self-plagiarism.

I will highlight a few sections from the paper. (Šupak-Smolčić, V., & Bilić-Zulle, L. (2013). How do we handle self-plagiarism in submitted manuscripts? Biochemia Medica, 23(2), 150–153. http://doi.org/10.11613/BM.2013.019)

In most cases of augmented manuscripts, the major overlap is seen within the methods section. As such, editors and readers can be misled to consider it as technical (self) plagiarism, which is usually not sanctioned with the same strictness as plagiarism of other parts of the paper. Nevertheless, if a submitted manuscript shows substantial overlap in the methods section with the author’s previous work, then the editor can consider this manuscript for publication only under the following circumstances:

  • the author refers to his previous work,
  • methods cannot be written in any other form without altering comprehensibility,

Although this section was about papers that reuse data, there is a relevant (imo) bit of text here: ‘methods cannot be written in any other form without altering comprehensibility’. This suggests that if they can be rewritten they should.

Later it seems to suggest that some overlap in methods might be OK, again at the discretion of the editor. But given the earlier passage just discussed, presumably, overlap is only deemed tolerable if unavoidable. In my paper, it is avoidable (as in, I can write it a different way, it’s just a hassle that is only being undertaken to avoid editorial hassles).

Based on the editorial policy of Biochemia Medica, upon detection of self-plagiarism, a submitted manuscript can be considered for publication only if it contains relevant new data and will contribute to overall scientific knowledge. Additional conditions have to be met:

When text similarity is observed with an author’s previous publication, and the original publication is cited, the submitted manuscript has to be revised, with the questionable parts corrected. Overlaps within the methods section can be tolerated, but the cut-off percentage is for the editor to decide. Similarities in the introduction section can be approached differently from the treatment of overlaps in the discussion and conclusion sections.

 

In case you think that this is silly and no one will ever face any consequences for text recycling: http://www.ithenticate.com/plagiarism-detection-blog/bid/94140/The-Challenge-of-Repeating-Methods-While-Avoiding-Plagiarism#.WUAFon0bjeQ (or search replies to my tweet to find the person whose paper got (desk?) rejected for this).

I’m not trying to pick on COPE or Taylor & Francis, I’m trying to lay out why it might not be as easy as the ‘just copy and cite’ advice I was getting. My suspicion is that that advice came from people working in very different fields with little appreciation for the nature of methods in other areas (and so why it might not be so easy for other researchers). We can have a discussion about whether these guidelines are reasonable; in fact, I think it would be good to do so. But I don’t see a way to come up with a one-size-fits-all approach to this precisely because of the differences in methods. For now, I think I’ll stick with reworking my methods sections as best I can while still including all of the relevant details, because I think that methods are important for evaluation, making people go elsewhere to read them is bad, and I don’t want to get dinged by checkers for too much overlapping text. And I think that this is probably true for most people in my field. Other fields are likely quite different in terms of how much specificity is really required. Moreover, I want people to know what I actually did! Too often people think you did something you didn’t do, and then make claims about your work that are incorrect. If I provide details in the papers, they have less of an excuse for that. (Same goes for me and other people’s work – I often go back to a paper thinking they did something but finding out I was wrong. If the details aren’t there, it’s harder to do.)

 

(There is also the question of who actually ‘owns’ the words an author might wish to reuse. Aside from the copyright issues with many journal publications – often authors do not retain it, the journal does – if the original paper was co-authored, the words in question don’t really just ‘belong’ to a single author, and so are they really theirs to do with as they wish? I don’t know the answer to this, but it’s interesting to think about.)

 

The Language and Learning Lab is at #EvoLang11!

Oksana Tkachman will be in New Orleans at EvoLang 2016 presenting our (Tkachman & Hudson Kam) poster “Arbitrariness of Iconicity: The Sources (and Forces) of (Dis)similarities in Iconic Representations”.

It’s a reporting of our initial findings on a really cool new project Oksana is running. Here’s a brief description of what the study is about:

“Our study investigates factors that might lead to favoring some features of referents over others in iconic representations. We investigate this by having hearing, sign-naïve adult participants invent gestured names for easily recognizable objects. The items participants were asked to create signs for differed along a number of dimensions that we hypothesize might influence the nature of the iconic representation, as shown in Figure 1. For instance, some of the items were man-made while others were part of the natural world, as it has been claimed that man-made objects are represented with handling (grasping) handshapes (Padden et al., 2013). We also investigated the effect of movement and size, for both man-made and natural categories. We anticipated that these categories would have impact on the choice of representational features; for example, the size and shape of natural objects would be encoded in the gestures, and the man-made objects would be represented by the prototypical interaction of humans with those objects.”

If you want to know what we found, go see Oksana present the poster! (Or just email either of us for a copy. Oksana: tox.cs84@gmail.com, or Carla.HudsonKam@ubc.ca)

 

“The impact of conditioning variables on the acquisition of variation in adult and child learners” just out in Language

I’m happy to be able to say that “The impact of conditioning variables on the acquisition of variation in adult and child learners” is now out in the most recent issue of Language. (Note, it’s not open access.)

Abstract: “Natural human languages often contain variation (sociolinguistic or Labovian variation) that is passed from one generation of speakers to the next, but studies of acquisition have largely ignored this, instead focusing on aspects of language that are more deterministic. Theories of acquisition, however, must be able to account for both. This article examines variation from the perspective of the statistical learning framework and explores features of variation that contribute to learnability. In particular, it explores whether conditioning variables (i.e. where the pattern of variation is slightly different in different contexts) lead to better learning of variation as compared to when there are no conditioning variables, despite the former being conceptually more difficult. Data from two experiments show that adult learners are fairly good at learning patterns of both conditioned and unconditioned variation, the latter result replicating earlier studies. Five-to-seven-year old children, in contrast, had different learning outcomes for conditioned versus unconditioned variation, with fewer children regularizing or imposing deterministic patterns on the conditioned variation. However, the children who did not impose deterministic patterns did not necessarily acquire the variation patterns the adults did.”

 

One more push on our internet survey on book reading – and some news (hint: it’s about data sharing)

I’m going to try one more big push for responses to our study on children’s book reading. For those of you who don’t know (or have forgotten), I’m conducting a study in collaboration with Lisa Matthewson on aspects of children’s books. But before we can examine the books, we want to know which books children are being read most often. It turns out that although there is a large literature on book reading, mostly focused on what kinds of books or book reading practices seem to be related to various aspects of development (often linguistic but not always), most analyses either don’t rest on the specifics of the books or the analyzed books are chosen based on things like sales records. And much of the literature focuses on children who are older than the age we are most interested in. For all of these reasons, we decided to start by collecting information on the books children aged 0-36 months are being read most often. We are doing this via an internet survey that asks parents and caregivers some questions about the books they are reading most often to their children (in English). We have about 700 responses so far, but would like to get to over 1000 if we can.

If you can help us get this survey out to more people, by posting it on your Facebook page, tweeting a link, etc., we’d really appreciate it. And in some exciting news, we just got permission to share the eventual data set (which does not include any identifying information about participants, so respondents don’t need to worry about anyone knowing who they are). If you are someone who is interested in children’s books, just think about how more data will be better for you too! So if you know parents of children aged 0-36 months who read books in English to their children, or have a way to get this survey out to some, any and all help is appreciated. There are no restrictions on country, monolingual vs. multilingual, or anything like that. Here’s a link to the invitation page.

Thanks for any and all help getting this out one last time.

Parents of children aged 0-36 months needed for internet survey on children’s books in English

The Language and Learning Lab at the University of British Columbia is looking for parents of children aged 0-36 months to participate in an internet survey regarding book reading. The survey takes approximately 10-15 minutes. If you are interested in participating, please click here.
Thanks!

(Feel free to pass this message along to parent friends.)

Just published: Learning language with the wrong neural scaffolding: the cost of neural commitment to sounds.

I’m excited to finally be able to announce a publication on the blog! It’s a paper entitled “Learning language with the wrong neural scaffolding: the cost of neural commitment to sounds” that just came out in Frontiers in Systems Neuroscience. It’s part of a special issue on sensitive periods in development. So happy to see this work finally come out. Congratulations to first author Amy Finn!

It’s my first foray into open access publishing, and it was a great experience.

Abstract is here: “Does tuning to one’s native language explain the “sensitive period” for language learning? We explore the idea that tuning to (or becoming more selective for) the properties of one’s native-language could result in being less open (or plastic) for tuning to the properties of a new language. To explore how this might lead to the sensitive period for grammar learning, we ask if tuning to an earlier-learned aspect of language (sound structure) has an impact on the neural representation of a later-learned aspect (grammar). English-speaking adults learned one of two miniature artificial languages (MALs) over 4 days in the lab. Compared to English, both languages had novel grammar, but only one was comprised of novel sounds. After learning a language, participants were scanned while judging the grammaticality of sentences. Judgments were performed for the newly learned language and English. Learners of the similar-sounds language recruited regions that overlapped more with English. Learners of the distinct-sounds language, however, recruited the Superior Temporal Gyrus (STG) to a greater extent, which was coactive with the Inferior Frontal Gyrus (IFG). Across learners, recruitment of IFG (but not STG) predicted both learning success in tests conducted prior to the scan and grammatical judgment ability during the scan. Data suggest that adults’ difficulty learning language, especially grammar, could be due, at least in part, to the neural commitments they have made to the lower level linguistic components of their native language.”

How we do things around here – or the importance of thinking about the hows and whys

“I try to think about the mechanism as much as possible – it’s the very careful and most systematic “how” explanation that we should always have in mind. Still, if I see this word nonchalantly thrown around in another review of my work, I will scream. It’s important. Fundamental. So fundamental, that it should motivate the design of experiments and theory guiding them, not the speculative post-hoc interpretations we make (at least not out loud right?). Mechanism is becoming the word used by (perhaps lazy?) reviewers who don’t have anything specific to say. Ironic. That, or perhaps I’ve just been reading way too many reviews these last 2 weeks and I need to thicken up…”
– Amy Finn, PhD^

I know I said that this blog was going to be used mostly to post updates on the goings on in the lab, and this post won’t fit with that theme. But it touches on something that I’ve been thinking about a lot lately, while introducing you to the way I think, and the way I try to train grad students to think.

I think about theory, a lot. Everything I do is done with theory in mind. By theory, I don’t necessarily mean big grand theory, I mean, more specific concrete ideas about how things work (the mechanism Amy refers to), how different aspects of cognition are related to each other, and how my findings fit in with what we already know. I don’t do ‘cute’ studies that are interesting ‘just because’. I always want my work to tell us something, something bigger. This means that I think. A lot.

This makes me slower than many other researchers. I don’t jump into things quickly. And I don’t tend to write quickly either. It also means that I frustrate students. When a student comes to me with an idea for a study, especially early in their relationship with me, I usually stop them from explaining it to me part way through and ask why, why they would do whatever it is they are proposing to do. What will it tell us about anything other than the particular experiment they are describing? Usually they don’t have much of an answer. That’s why I encourage students to start with questions, not ideas for studies. Armed with a question, we can design a study to answer that question. And then we sit and think about what positive and negative results would mean, what are the possible confounds, other interpretations, and think about what the follow up studies should be given different sets of results. I almost never design one experiment at a time. And I never think about tweaking variables in a study just because they are there to be tweaked. I always want a reason, a bigger picture reason, for manipulating a variable. That’s not to say that ‘little’ variables (like ISI) aren’t important. In fact, we’re finding out in an ongoing study by Alexis Black that it (ISI) is. We found that out by accident though, and we’re now investigating it purposely, with informed ideas about why it has the effect it seems to have. Ideas that might be wrong. (Stay tuned to the blog for more on this in the near future.)

So in general, my approach to graduate training is to help students learn to think in a certain way, not to think certain things. Specifically, I want them to leave the lab approaching research in a certain way. To think big thoughts, even about ‘small’ things. The conclusions they come to, and the theories they espouse, might be different from mine. That’s as it should be. And I try to remain open enough myself that I can learn from them as well. I’m not sure how successful I am at all of this, but it seems to be working OK. On the latter point, my own interests have been demonstrably affected by my students (see my continuing interest in gesture, for instance, which is all due to the influence of Whitney Goodrich Smith). On the former, I am heartened by the recent Facebook post by Amy Finn that is the quote leading this post.

But I am also frustrated. Frustrated for her, as I seem to have made her life much more difficult by encouraging her to think this way. Frustrated that this is so far from the standard way of working that reviewers don’t believe you when you say that contrasts are planned. And ask you to examine your data every which way, with no theoretical basis, while simultaneously chastising you for doing too many comparisons. (How doubling the number of statistical tests is a solution for too many to start with is beyond me.) And who encourage you to remove non-significant findings from a paper. You know what, I include carefully controlled variables in a study because there was reason to believe that they would affect outcomes. Sometimes I am wrong. And I think it is worth knowing that I am wrong. Especially when results from other related studies would suggest otherwise. (I have been able to include some null results before, see e.g., *Hudson Kam, 2009, so public records of my incorrect ideas do exist. But not enough.) But as we all know, null results are notoriously hard to publish. This is a topic that has been much discussed lately. And people are pushing better stats as a way to fix the problem. But that is only part of the solution. It seems to me that situating work within theoretical issues and questions that are specific enough to be meaningful (not just of the ‘hey, are these two things related’ variety) is another crucial part of the solution. But of course, this only works if we also know about failures. And understand them.

What is the point of this post? Well, it is two-fold. One, to announce that we will post failed experiments and conditions on the blog so that there is a more public record of them, from my lab at least. Two, to make a point about the importance of thinking from a theoretical perspective. And as part of this, to inform people of how we do things in our lab. To clearly state that if you encounter a student who has worked with me, you can ask them to justify their work, to put it in a broader framework. I can assure you, they have thought about it. (And to warn people who might be interested in working with me about this. My way of working works for some people, but not for others.) We don’t have all the answers here. (If we did, I’d be out of a job.) But we’re really good at questions.

 

^quote used by permission

*Hudson Kam, C.L. (2009). More than words: Adults learn probabilities over categories and relationships between them. Language Learning and Development, 5, 115-145.