What major works of literature were written after age of 85? 75? 65? (statmodeling.stat.columbia.edu)

by paulpauper 98 comments 151 points

[−] FelipeCortez 45d ago
This is actually a good fit for a Wikidata SPARQL query you can run here https://query.wikidata.org/:

  SELECT ?work ?workLabel ?author ?authorLabel ?publicationDate ?ageAtPublication
  WHERE {
    ?author wdt:P569 ?birth .
    ?author wdt:P570 ?death .
    ?author wdt:P800 ?work .
  
    ?work wdt:P50 ?author ;
          wdt:P31 wd:Q47461344 ;
          wdt:P577 ?publicationDate .
  
    FILTER(?publicationDate <= ?death)
  
    BIND(YEAR(?publicationDate) - YEAR(?birth) AS ?ageAtPublication)
    FILTER(?ageAtPublication > 60)
  
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
  }
  ORDER BY DESC(?ageAtPublication)
  LIMIT 300
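
One caveat about the query above: `YEAR(?publicationDate) - YEAR(?birth)` is a year difference, so it can overstate the age by one when the work came out before the author's birthday that year. Here is a minimal Python sketch of the difference. It assumes full-precision dates (many Wikidata dates are only year-precise), and the helper names and dates are made up for illustration:

```python
from datetime import date


def year_difference(birth: date, published: date) -> int:
    """Same arithmetic as YEAR(?publicationDate) - YEAR(?birth) in the query."""
    return published.year - birth.year


def exact_age(birth: date, published: date) -> int:
    """Completed years of age on the publication date."""
    had_birthday = (published.month, published.day) >= (birth.month, birth.day)
    return published.year - birth.year - (0 if had_birthday else 1)


# Made-up dates, chosen only to show the off-by-one case.
birth = date(1900, 12, 1)
published = date(1961, 3, 15)
print(year_difference(birth, published))  # 61
print(exact_age(birth, published))        # 60
```

With a `> 60` filter, the year difference would let this made-up work through even though the author was still 60.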
[−] andai 45d ago
How can I learn more about this? I looked into it recently but didn't get very far.

This seems like the kind of thing that should be more widely known, and have some good tutorials written for it :)

[−] FelipeCortez 45d ago
The Wikidata documentation is good:

https://www.wikidata.org/wiki/Wikidata:Introduction

And you can find lots of SPARQL examples here:

https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/...

[−] carefulfungi 45d ago
Wow - this is super cool. Thanks for sharing!
[−] CobrastanJorji 45d ago
This is a very cool query tool that I haven't seen before, thanks! (Also the syntax drives me a little batty).

I tried modifying it to give me authors whose first publication (any publication at all) happened after 60 years old, but also who had at least one wdt:P800 work. I got people like Cato the Elder, Josephus, and William of Tyre.

I tried again for only people born in the 20th century, and I got some results (plus quite a few wrong answers, presumably due to something about the query or data)! Oddly, quite a few of the results are from criminals who wrote an autobiography after their release, including Henri Charrière and the infamous Nazi, Albert Speer.

[−] OfirMarom 45d ago
This is actually very awesome. Had no idea about this.
[−] tantalor 45d ago
Can you filter by "major works only"?
[−] FelipeCortez 45d ago
that's kind of what P800 (notable work) is doing, but you can try some approximations to "major work" with "has both an English Wikipedia page and a Goodreads link":

  ?work wdt:P50 ?author ;
        wdt:P577 ?publicationDate ; 
        wdt:P8383 ?goodreadsID .

  ?article schema:about ?work ;
           schema:isPartOf <https://en.wikipedia.org/> ;
           schema:inLanguage "en" .
[−] throw4847285 45d ago
I don't think that's what they meant by major.
[−] par1970 45d ago
Why?
[−] wodenokoto 45d ago

> asked LLMs to compile list of 10-20 writers considered canon in each decade since 1800, then identify all their notable works and years of publication. After some iterations with coding agents I got over 2,000 works by 200 authors.

Wait, so the source data is just LLM hallucinations? It makes sense to use an LLM to build the data collection, but not to build your source data.

[−] cowboylowrez 45d ago
This is, in my opinion, a better use of tech that has an error rate (hallucination): you just assume that it's a fuzzy search, and sample the results to see how you did. I'd like to see a few from the results for sure!
[−] dyauspitr 45d ago
LLMs cite. So I hope they did their due diligence.
[−] ijk 45d ago
It feels a lot like storing your data as an essay in a Word doc instead of a spreadsheet. It can work and all of the math is probably correct, but it's very much the wrong tool when the structured data was right there to be used instead.
[−] dyauspitr 45d ago
The structured data is scattered all over the place. This does the very important thing of aggregating it and bringing it together. If you had to do that manually, it could take weeks.
[−] Ajedi32 45d ago
What do you mean by due diligence here? Manually checking 2000 citations sounds a lot harder to me than just pulling the data from a reliable source to start with.
[−] mellosouls 45d ago
I think this is pretty common across different creative forms albeit with different age ranges but constrained at the higher end.

So the greatest physics, maths, poetry and pop music are done by people in their 20s.

Literature (esp novels) seems to occupy an older range, perhaps 30s to 50s. Perhaps classical music and philosophy also? I don't know about the visual arts.

I interpret it as the former requiring the creative fireworks of youthful neural elasticity and the latter the depth we associate with lived experience and wisdom.

Naturally there are outliers (general relativity in Einstein's mid 30s, Shakespeare's wordplay till his late 40s) but I think in general these rules of thumb seem to be a good guide for the very highest achievers and for the most creative periods for us mere mortals.

Mediocrity of course is unconstrained by age.

[−] ralferoo 45d ago
(complete sidetrack)

I think this graph is a great illustration of how hard anonymising data is. It's very easy to isolate individual authors from this list, because there are clear diagonal lines where the year and age increase in lockstep. The strong diagonals everywhere also suggest there aren't actually that many authors in this collection.

There's probably also some erroneous data here with a bunch of points representing material written by people at age 34 between about 1920 and 1940 (an obvious horizontal line) when most of the rest of the graph doesn't show any strong horizontal bias for a specific age.
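
The diagonals follow from simple arithmetic: for any one author, publication year minus age at publication is the birth year, which never changes. A minimal Python sketch with made-up points (not the article's data) that groups points by that difference:

```python
from collections import defaultdict

# Made-up (publication_year, age_at_publication) points, not the article's data.
points = [(1920, 34), (1925, 39), (1931, 45), (1922, 50), (1930, 58), (1938, 34)]


def group_by_birth_year(points):
    """Group points by year - age, which equals the author's birth year."""
    groups = defaultdict(list)
    for year, age in points:
        groups[year - age].append((year, age))
    return dict(groups)


for birth_year, works in sorted(group_by_birth_year(points).items()):
    print(birth_year, works)
```

Two authors born in the same year would land in the same group, so this recovers diagonals rather than identities, but with few authors in the sample the two are close to the same thing.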

[−] rjtavares 45d ago
Opened it just to check if Saramago was there, and indeed, he is.

For most of his professional life he was a journalist. He published his second novel at 55, only found his narrative style at almost 60, then wrote 15 novels (and won a Nobel) after that. What an amazing career.

[−] keiferski 45d ago
It’s difficult to be a truly interesting person with a unique perspective on life, and have the skills to transmute that experience into a work of art, when you’re young. You simply haven’t logged the hours in the world, and I kind of don’t trust your opinion on something if you haven’t.

Not sure if I’d call him a major writer, but Raymond Chandler is one of my favorites and I think he’s a good example. To me there is a fundamental difference between his crime stories, which show the results of corporate life, alcoholism, personal tragedy, war, etc. and a more modern crime writer that’s just writing a genre piece with all the right pieces, but no actual personal experience.

[−] seanhunter 45d ago
Well, the canonical example is Diana Athill, who had a long and distinguished career as a literary editor for people like Philip Roth, John Updike, Margaret Atwood, Jack Kerouac and others, then retired at the age of 75 and started writing her own novels and memoirs, and is considered one of the greatest writers in English of the 20th century. “After a Funeral” is, I think, the one of hers I read, and it’s amazing.

https://en.wikipedia.org/wiki/Diana_Athill

[−] thinkingemote 45d ago
"The accepted notion is that age confers a spirit of reconciliation and serenity on late works, often expressed in terms of a miraculous transfiguration of reality....But what of artistic lateness not as harmony and resolution, but as intransigence, difficulty, and contradiction? What if age and ill health don’t produce serenity at all?"

Thoughts on Late Style by Edward Said https://www.edwardsaid.org/articles/thoughts-on-late-style/

[−] gmuslera 45d ago
Major=got popular enough? That doesn't need to be fully correlated to the quality of the work.
[−] arduanika 45d ago
George R. R. Martin completed his cycle "A Song of Ice and Fire" when he was...wait...I'll get back to you on this one.
[−] OtherShrezzing 45d ago
This is a disappointing statistical modelling technique.

The author asked LLMs to produce lists of data which are readily available on the likes of Wikipedia. Author date of birth, list of publications, and publication release date are all fairly easy to get hold of. They just need to be formatted appropriately. The LLMs produced a few false positives, and missed out some prominent works.

I get that this is just the author working in public & writing about what they're up to, but the number of avoidable errors introduced by the methodology makes reading it a poor use of time.

[−] latexr 45d ago

> In trying to come up with some good examples I asked LLMs. (…)

> So I tried to cast the net more broadly and asked LLMs (…)

> EDIT: also hunted down several mistakes, as one would expect from LLMs; thanks to commenters.

This is a slop post. You can’t trust any of the data. It’s baffling and worrying the author apparently understands mistakes from LLMs are to be expected but still decided to publish without doing due diligence.

[−] boznz 45d ago
For me, my 60s were the best time to start writing fiction. Before then I always had excuses for why I would not write; now, with much more free time, experience, and no money worries, I can think back on all those thousands of novels I read, knowing I could write a better one. Writing is also one of the cheapest retirement hobbies you can have, and you are more likely to experiment across different genres because you are not pandering to an audience.
[−] NetMageSCW 45d ago
It feels like a natural result of two things: life expectancy only rose above 70 (worldwide average) in 2021, and a number of years have to pass after publication before something is deemed a major work, so it is natural that there are few today. Something like 100%, 110%, and 120% of life expectancy at the author’s time of birth might be a more useful measure today.
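
A minimal sketch of that relative measure in Python, assuming you already have a life-expectancy figure for the author's birth cohort. The function names and the numbers below are placeholders for illustration, not real demographic data:

```python
def relative_age(age_at_publication: float, life_expectancy_at_birth: float) -> float:
    """Age at publication as a fraction of life expectancy at the author's birth."""
    if life_expectancy_at_birth <= 0:
        raise ValueError("life expectancy must be positive")
    return age_at_publication / life_expectancy_at_birth


def bucket(ratio: float, thresholds=(1.0, 1.1, 1.2)) -> str:
    """Label a ratio by the highest threshold it meets (100%, 110%, 120%)."""
    label = "below 100%"
    for t in thresholds:
        if ratio >= t:
            label = f"at least {t:.0%}"
    return label


# Placeholder numbers only: published at 70, cohort life expectancy of 55.
ratio = relative_age(70, 55)
print(f"{ratio:.2f}", bucket(ratio))
```

Under this measure, an author publishing at 70 from a cohort expected to live to 55 counts as "later" than one publishing at 80 from a cohort expected to live to 80.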
[−] OJFord 45d ago

> Also interestingly, the trend in that graph keeps going up in recent years… but it looks to me like this is driven by lack of major works from young authors. It may be how my sample is constructed.

Isn't that because older authors have had more time to gain notoriety, and for their earlier works to be deemed 'major' in retrospect?

[−] shrubble 45d ago
Douglas Southall Freeman wrote the definitive biography of Robert E. Lee over twenty years, publishing it when he was 49; he then went on to publish his seven-volume biography of George Washington, starting when he was 62 (he finished the sixth volume on the day he died; the seventh was completed by his research assistants).
[−] CobrastanJorji 45d ago
There are a suspiciously large number of very straight diagonal lines on those graphs with identical slopes. I might predict that they are individual famous authors who released a lot of works, but the slopes are all identical. What's going on there?
[−] candlemas 45d ago
John Milton was 63 when Paradise Regained and Samson Agonistes were published.
[−] bethekidyouwant 45d ago
I feel like Cormac McCarthy famously took 20 years to write his novels, so does it really count if you finished it when you were 72?
[−] lkm0 45d ago
Beyond the data science interest, isn't this sort of charting powered by the "my time's running out and I still haven't left my mark in history" intrusive thought? Purely from a fitting perspective I'd wager the correlation is close to zero, because "major works" will be different in a century, and different again in two. Shakespeare was not very popular in the 17th century, per Wikipedia. As George Orwell put it, it's much easier to write when you do it for a purpose that matters to you. Hugo wrote Notre-Dame mostly to rant about architecture; creating a major work for the purpose of staving off fears of being forgotten is, I feel, not enough in itself.
[−] vismit2000 45d ago
Most of the literature by Srila Prabhupada used in most universities around the world was written well over the age of 75: https://prabhupadabooks.com/books
[−] ikidd 45d ago
That doesn't bode well for GRR Martin getting the last book done.
[−] apparatur 45d ago
covfefe was at ~71