Making a Twitter bot for Clarice Lispector

If I have to doomscroll, I might as well make it enjoyable

Liking Clarice Lispector's writing isn't enough, reading all her books isn't enough, I need to be constantly feeding her words into my veins.

After devouring her books over the winter, I realized that I was severely missing her writing and her daily voice in my head.

To make up for that, I went through a phase of unearthing any other Clarice deep cuts I could find: historical articles (some lovely posts about her in the archive here), videos, research papers (here, here), rereading some of her short stories, and even a brief foray into philately.

I briefly considered rereading her books but decided against it (for now) because a) I didn't want to grow tired of her, b) it was too soon, since I had just finished her oeuvre (and I believe books need to be partially forgotten before you can revisit them), and c) I also wanted to get through the other books in my TBR pile. "How can I still serendipitously come across her writing each day, and with the least amount of friction?" I asked myself. The answer was, and I'm not proud to admit this, by having her show up on something I was reading every day: my Twitter feed.

I wanted a way to be exposed to her writing organically, coming across her excerpts in bits and pieces every day, since almost every sentence of hers seems to be quotable and underlineable. These exposures also had to be short, since I wasn't sitting down to read a book of hers. Conveniently, a tweet seemed like the perfect length.

It was also a nice excuse since I always wanted to know how to make a bot. I thought the market would be quite saturated with Clarice bots already, but to my surprise there weren't any. There were a couple in Portuguese, and even those weren't quite regular with their posting, so this seemed like the perfect opportunity.

So you want to make a Twitter bot

Step 1: Gathering the material to post

Choosing a body of work to quote

The first step was to find the actual material that the bot should post. It was difficult to find digital versions of her books for free, especially in English. I had her entire collection in paperback, and I wasn't about to purchase it all again digitally (even then, there's no guarantee you can use the text outside of the proprietary file types). So I sailed the high seas to find some high-quality copies, which almost always means .epubs or .txts because they're much easier for a computer to parse. I mostly found .epubs though, so I used a free online tool to convert them to .txts to make them as easy to parse as possible. (Side note: some of these copies were by a different translator than the one I had read, and I preferred my version, but this was the best I could do.) As an added measure, I manually deleted everything apart from the actual text of each work, including tables of contents, introductions, forewords, etc., to prevent those sentences from being posted by the bot.

Curating the whole text for quotable chunks

I knew from the outset that this had to encompass her entire body of work, since I wanted not just the greatest hits, but everything in between. I know that some bots use a text file with a curated list of quotes, one per line, but this would get repetitive fast, and I did not want to be the arbiter of what constituted a "good" quote here. Luckily for me, I didn't have to further curate her pieces into quotable and postable chunks, since almost every sentence of hers reads as quite profound. Plus, even a mundane sentence can sound profound when you read it from a "curated" bot post (beauty lying in the eye of the beholder and all that).

Step 2: The technicals

The Algorithm

Selecting a book and sentence to quote

How did I decide what quote to post and from where and how often? Randomly. I placed all the .txt versions of the books in a folder named "Texts". Each time the bot posts, it picks a book randomly from that folder, and posts a sentence or paragraph from there that fits in the 280-character limit for Twitter.
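As a rough illustration of the random book pick described above, here is a minimal sketch. It is not the bot's actual code: the function name `pick_random_book` and the optional `rng` parameter are made up for this example, and only the "Texts" folder name comes from the post.

```python
import random
from pathlib import Path
from typing import Optional

TEXTS_DIR = Path("Texts")  # folder name taken from the post


def pick_random_book(texts_dir: Path = TEXTS_DIR,
                     rng: Optional[random.Random] = None) -> Path:
    """Return the path of one randomly chosen .txt file in texts_dir.

    `rng` is a hypothetical hook so the choice can be seeded in tests;
    by default the module-level random generator is used.
    """
    chooser = rng or random
    books = sorted(texts_dir.glob("*.txt"))  # sorted so a seeded pick is repeatable
    if not books:
        raise FileNotFoundError(f"no .txt files found in {texts_dir}")
    return chooser.choice(books)
```

Calling `pick_random_book()` from the repository root would return something like `Texts/<some book>.txt`, with every book equally likely on each run.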

Ensuring the quoted chunks are legible

To make sure that the posts are legible, the bot follows these steps:

Split the text of the book into paragraphs (by checking for new lines).
For each paragraph, split on periods "." to define the sentences within it.
If the paragraph is a single sentence and is not too short (more than two words), post it.
If the paragraph has multiple sentences, combine two or more of them as long as they fit within the 280-character limit.
When combining sentences, if the chosen sentence happens to be the first sentence of the paragraph, always take the next sentence. If not, take either the previous sentence or the next sentence, adding up to 280 characters.

Example of selecting a book, and a single sentence within it.

Example of selecting a book, and two sentences within the paragraph.
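The chunking steps listed above can be sketched roughly as follows. This is one possible reading of them, not the bot's real implementation: the function names are invented for illustration, sentences are split on a period followed by whitespace (so the period stays attached), and abbreviations or ellipses are not handled.

```python
import random
import re
from typing import List, Optional

MAX_LEN = 280   # Twitter's character limit, as stated in the post
MIN_WORDS = 3   # "greater than two words"


def split_paragraphs(text: str) -> List[str]:
    """One paragraph per non-empty line."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def split_sentences(paragraph: str) -> List[str]:
    """Split after each period that is followed by whitespace."""
    parts = re.split(r"(?<=\.)\s+", paragraph)
    return [p.strip() for p in parts if p.strip()]


def pick_chunk(paragraph: str, rng: random.Random) -> Optional[str]:
    """Pick a sentence and grow it with neighbours while it fits MAX_LEN."""
    sentences = split_sentences(paragraph)
    if not sentences:
        return None
    start = rng.randrange(len(sentences))
    lo = hi = start
    chunk = sentences[start]
    if len(chunk) > MAX_LEN:
        return None
    while True:
        candidates = []
        # A chunk that starts at the first sentence only grows forward.
        if start > 0 and lo > 0:
            candidates.append((lo - 1, hi))
        if hi < len(sentences) - 1:
            candidates.append((lo, hi + 1))
        # Keep only the extensions that still fit in a tweet.
        fitting = [(a, b) for a, b in candidates
                   if len(" ".join(sentences[a:b + 1])) <= MAX_LEN]
        if not fitting:
            break
        lo, hi = rng.choice(fitting)
        chunk = " ".join(sentences[lo:hi + 1])
    # Skip chunks that are too short to be worth posting.
    return chunk if len(chunk.split()) >= MIN_WORDS else None
```

A caller would pick a random paragraph from `split_paragraphs(book_text)` and retry whenever `pick_chunk` returns `None`. What to do when a sentence is too long on its own is not covered in the post; returning `None` here is an assumption.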

Automating the posting

I used GitHub Actions for this one. I made a .github/workflows folder and added a yml file inside of it with the following configuration:

name: Clarice Lispector Bot

on:
  schedule:
    - cron: '0 * * * *'  # Runs at the top of every hour
    #- cron: '30 * * * *' # Runs at the 30th minute of every hour
  workflow_dispatch: # Allow manual runs

jobs:
  tweet:
    runs-on: ubuntu-latest

    steps:
    - name: Checkout code
      uses: actions/checkout@v3

    - name: Set up Python
      uses: actions/setup-python@v4
      with:
        python-version: '3.9'

    - name: Install dependencies
      run: |
        python -m pip install --upgrade pip
        pip install -r requirements.txt

    - name: Run the bot
      env:
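        # The secret expressions were lost in rendering; the secret names
        # below are assumed to match the variable names.
        API_KEY: ${{ secrets.API_KEY }}
        API_SECRET: ${{ secrets.API_SECRET }}
        ACCESS_TOKEN: ${{ secrets.ACCESS_TOKEN }}
        ACCESS_TOKEN_SECRET: ${{ secrets.ACCESS_TOKEN_SECRET }}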
      run: python main.py

    - name: Commit and push posted_tweets.csv
      run: |
        git config --global user.name "github-actions[bot]"
        git config --global user.email "github-actions[bot]@users.noreply.github.com"
        git add posted_tweets.csv
        git diff --cached --quiet || git commit -m "Update posted_tweets.csv with new tweets"
        git push
      env:
        GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

Some notes:

You can see that I commented out the original cron job that posted every half hour. This was because the bot kept hitting the Twitter API rate limits when posting this frequently, even though, after checking the docs, I was sure that shouldn't be the case. I spent way too much time trying to figure this out and decided to just have the bot post every hour instead (which already seems plenty now that I look back). If someone knows why this happens, feel free to let me know.
The configuration is something I found online, but it instructs GitHub Actions to install the required dependencies and run the main.py file each time, with the API keys passed in as environment variables. It also pushes metadata about the posted tweet back to the repo, which I'll cover in the next section.
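To show how a script like main.py might pick up the four variables the workflow sets, here is a hedged sketch. The post does not say which Twitter library the bot uses; tweepy is assumed here purely for illustration, and `load_credentials` and `post_tweet` are hypothetical names.

```python
import os
from typing import Dict, Mapping

# Variable names as they appear in the workflow's env block.
REQUIRED_VARS = ("API_KEY", "API_SECRET", "ACCESS_TOKEN", "ACCESS_TOKEN_SECRET")


def load_credentials(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Read the four credentials, failing loudly if any is missing or empty."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}


def post_tweet(text: str, creds: Dict[str, str]) -> None:
    """Post one tweet. Assumes tweepy is listed in requirements.txt."""
    import tweepy  # imported lazily so the rest of the file works without it

    client = tweepy.Client(
        consumer_key=creds["API_KEY"],
        consumer_secret=creds["API_SECRET"],
        access_token=creds["ACCESS_TOKEN"],
        access_token_secret=creds["ACCESS_TOKEN_SECRET"],
    )
    client.create_tweet(text=text)
```

Failing early on a missing variable makes a misconfigured secret show up as a clear error in the Actions log instead of as an authentication failure from the API.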

Logging and metadata

I wanted to know, if I came across a really nice quote, which book it was quoted from. Each time the bot runs, it appends a row to the posted_tweets.csv file, which includes the tweet, the timestamp of the post, and the book it was posted from. So I only need to check the file and search for the text I want. It works pretty well for my use case and this way I also have a database over time for everything posted.
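A minimal sketch of this kind of logging, assuming the CSV has a header row with the columns `tweet`, `timestamp`, and `book` (the post names the three pieces of information but not the exact column names, and `log_tweet` and `find_book` are made-up helpers):

```python
import csv
from datetime import datetime, timezone
from pathlib import Path
from typing import List

LOG_PATH = Path("posted_tweets.csv")      # file name from the workflow
FIELDS = ["tweet", "timestamp", "book"]   # assumed column names


def log_tweet(tweet: str, book: str, log_path: Path = LOG_PATH) -> None:
    """Append one row, writing the header first if the file is new."""
    is_new = not log_path.exists()
    with log_path.open("a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow({
            "tweet": tweet,
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "book": book,
        })


def find_book(fragment: str, log_path: Path = LOG_PATH) -> List[str]:
    """Return the books whose logged tweets contain the given text."""
    needle = fragment.lower()
    with log_path.open(newline="", encoding="utf-8") as f:
        return sorted({row["book"] for row in csv.DictReader(f)
                       if needle in row["tweet"].lower()})
```

Using the `csv` module rather than joining strings by hand matters here, because her sentences are full of commas and quotation marks that would otherwise break the columns.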

Results: How is the bot doing today?

This was a medium-term project in the sense that I would be able to gauge its success after a few months on the following aspects:

Did it post regularly the whole time?

Yes! It's working quite well and still posts to this day with great regularity. There are a few times when the tweets don't go through thanks to the rate-limiting of the API, but those instances are rare, so I'm not too worried about finding a solution to it. To date (at the time of posting this), since the inception of this bot on 16-12-2024, there have been 5140 posts, averaging 21.67 posts a day (so not quite one post every hour, apparently).
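For a sense of scale, the figures above can be turned into a quick back-of-the-envelope check. It assumes the hourly schedule means 24 possible posts a day, and it does not say why a slot was missed.

```python
total_posts = 5140        # figure quoted in the post
posts_per_day = 21.67     # figure quoted in the post
slots_per_day = 24        # one scheduled run per hour

days_running = total_posts / posts_per_day
share_of_slots = posts_per_day / slots_per_day

print(f"about {days_running:.0f} days of posting")          # about 237 days of posting
print(f"{share_of_slots:.1%} of hourly slots got a post")   # 90.3% of hourly slots got a post
```

So roughly one hourly slot in ten went by without a post.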

Did it repeat a lot of its posts?

No, actually. I was able to check the numbers for this thanks to the metadata that the bot logs. There have been 4344 unique tweets, so there is some repetition out there, but it's a feature, not a bug, in my opinion. In fact, the most repeated tweet has only been repeated 5 times.

The selection of books was also nicely spread out. (Note that I added the last 4 books a few months after the first 5, which is why the first 5 have been quoted so much more.)
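Numbers like these can be pulled out of the log with a few lines of Python. The sketch below assumes the CSV has a header row with `tweet` and `book` columns, which the post does not spell out, and `tweet_stats` is a made-up name.

```python
import csv
from collections import Counter
from pathlib import Path
from typing import Any, Dict


def tweet_stats(log_path: Path) -> Dict[str, Any]:
    """Summarise total posts, unique posts, top repeat, and posts per book."""
    with log_path.open(newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))
    tweets = Counter(row["tweet"] for row in rows)
    books = Counter(row["book"] for row in rows)
    return {
        "total": len(rows),
        "unique": len(tweets),
        # (text, count) of the most frequent tweet, or None for an empty log
        "most_repeated": tweets.most_common(1)[0] if tweets else None,
        "per_book": dict(books),
    }
```

With the figures quoted above, 4344 unique tweets out of 5140 posts means roughly 15% of posts repeated an earlier one.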

Did I continue to enjoy reading it?

No numbers to back this up, but I'm happy to report that the answer is yes! I did wonder at the start if I'd grow bored of the tweets, but to my pleasant surprise, I still pause and read them (almost) every time I see a post on my feed, which already makes this project a success in my book.

Did people resonate with it?

While this was never made with an audience in mind (I never expected more than a handful of people to actually follow the bot), it found its own passionate set of people who engage with it.

It has ~2000 followers currently, and at any given time, I receive notifications for people engaging with it, liking, replying, quote-tweeting, which is always so nice to see.

Random Example 1:

This one was especially nice to see. I don't remember exactly where I came across Clarice for the first time, but I remember seeing her for the first time in print in the opening to Kaveh Akbar's Martyr!, and had the same thoughts as this user.

Random Example 2:

You can find the bot here: twitter.com/c_lispector_bot
