COLM
I’m in the last moments of my first academic conference, COLM, the conference on language modeling. Since last fall I’ve been working with a lab at the University of Illinois that focuses on constrained generation. I wrote a paper about the relationship between byte-level tokenizers and UTF-8 that was accepted for a poster at this conference (tl;dr: Unicode is a very leaky abstraction that doesn’t play nice when you manipulate it at the byte level) and came here to present it in the poster session. I’ve met several heroes and colleagues and hopefully not humiliated myself too much. I’ve got some disparate observations I may find order for in the end, but for now I’ll write them out independently.
- I’ve never been to a conference like this before, but I have been to conventions, which are, to a certain extent, similar. The difference I’ve detected is that attendees at a conference are professional colleagues, whereas conventions are more social and friendly. I made the mistake of referring to Alex Lew as a “conference friend” in front of Jason Eisner, to which Jason very kindly said “well I’ll be another conference friend” or something equivalent. But that’s not really the impact I’d like to make.
- I’m not sure what the etiquette around conversations is. One of the reasons these events happen, it seems, is to get people in a room together talking to one another. But people often end up in the middle of conversations into which it’s difficult to insert oneself, so one stands awkwardly waiting for a moment to be attended to. I walked away from a lot of posters that I’d have liked to know more about because I was simply ignored. The conversation with Jason and Alex was an example: I knew Alex from dropping in on lunch with him and some colleagues, who were kind enough to let me join up with their group when I ran into them on the street and recognized their conference name badges, and I wanted to introduce myself to Jason. But they were in the midst of something, and I just stood there for what seemed like several minutes until they reached a point where they noticed me. It was strange: I could tell they knew I was there because of the way they moved around in response to my presence in space, but they didn’t look at or address me until Alex finally turned and greeted me. I don’t know what I should have done in the circumstance, but I’ve been far less aggressive about talking to people since.
- This is about the hottest room in AI right now, but people are surprisingly bearish: there’s a lot of progress to be made still, and we’re far from a general solution to artificial intelligence. I don’t feel it’s worth it to report on trends, because I don’t feel that I have a grasp of all the trends. Things change very quickly, and there’s a lot of simultaneous reinvention; problems are being discovered all the time and there’s little theoretical basis for progress: most of what’s being done is purely empirical. The research program of artificial intelligence has always been defined quite vaguely: to program computers to do things that are normally regarded as signs of intelligence when humans do those things. We’re still playing in and exploring the space.
- This is exactly where I wanted to be when I set out in the spring of 2021 to learn everything I could about language models. I now know enough to engage intelligently at a conference like this as a participant and contributor. Obviously I’m not a great expert, but I don’t feel as if I’m being completely left in the dust and unable to follow what’s going on. Now that I’m here, though, I don’t know whether this is what I want. I’m used to using the statement that something isn’t what I want as a hedge against not being able to get it: if I can’t get it, I can make myself feel better by convincing myself I never wanted it anyway. But when I decide that I actually don’t want something, then I doubt myself and wonder whether I’m just making excuses to justify not trying harder or whatever. And if I don’t want this, what do I want?
- I’ve been struggling with doing things lately; I’ve been feeling lazy and as if my projects are pointless. I like programming for its own sake but have been struggling to keep up with writing. I’ve been working on the Yoga Sutras and using them as an occasion to study Sanskrit. The yoga teacher training has been working my body out. I worry that I’m not recovering terribly quickly from this concussion, but my memory of dreams is improving and my mental fatigue is getting less bad. I’ve been struggling at this conference, but I hardly slept at all either the night before or the night after the first day and the exhaustion got to me. I’m just trying to rest and take some care of myself. I’ve been breaking out like hell, especially on my forehead, since starting the yoga teacher training: I think it’s all those times I rubbed my forehead on the mat to “give my third eye a massage” during class.
- People have been generally very kind and encouraging and giving of their time. I feel quite welcomed, in the end, and that there are clear prospects for future work. I’m a newcomer to the scene and have a lot to learn, but there’s much more to be done both in the discipline itself and for my own edification. I want to continue working with the lab at UIUC, if they’ll have me, and I’m attending a meeting Monday for work on a survey paper about tokenization with people I’ve met at the conference. I feel much more like a real scientist than I did before coming here. Ari Holtzman had a wonderful conversation with me after a very kind student of his introduced us. Ari offered some unsolicited advice: don’t go to graduate school, because the academic-industrial complex is morally bankrupt. I appreciate his candor.
- I’m feeling Peter Turchin’s elite overproduction at play here: somehow both demand for expertise and competition for positions are increasing. If there is still so much work to be done, how come there are so many layoffs? How come hiring is so slow? It could be purely my perception, and it could be that expertise in artificial intelligence will be more protected than other computer expertise. But there’s also a sense of urgency, or rush, of competition for limited resources. There are only so many seats in so many labs.
- I have the experience necessary to qualify me for a PhD program. The decision of where to go will be a more abstract one of fit and aptitude: how I think about problems and the kinds of problems I want to solve. But I don’t know that I want to do a PhD program in this discipline, and I’m not sure what the problems I want to solve are. I was talking to my friend Zach over my birthday, and he said that his next career move will be motivated by following interesting problems. I have a surplus of options: there are so many interesting challenges to address and fun things to try. The hard thing is deciding what to focus on. For now, tokenizers and constrained generation are a cool space to work in, but I’d like to eventually work on other things too. I think you’re able to branch out in your career, which is good.
I don’t think I’m going to apply for a PhD program that starts next fall. Partially because I don’t want to do the application, partially because I feel that my resume isn’t impressive enough yet, and partially because I want to keep traveling for a good while. Honestly, I don’t want to pass more of the young part of my life working in an office at a computer and going to conferences like these. I still have to learn some conceptual basics (lots of math I’m short on) before I feel ready to make my move. And I’m happy with the number of projects I’ve been working on: I like being the independent researcher; showing up and being known for doing good work is what’s really important. And now I have a Google Scholar profile, which is a trip. But I’d rather spend my youth out doing the fun part of being retired, of going places, of doing things.
- Montreal was a good place to have the conference; the facilities were reasonably priced, well-positioned, and accessible. Montreal has many unhoused people in states of physical impairment: I saw one man walking bent over horizontally at the waist; another supported himself on a crooked old cane. There was a protest of thousands against the ongoing genocide in Palestine the second night.
Now that I’m back in San Francisco I’m observing the same phenomena on the streets here, along with the high-tech advertising. That’s a subject for another time.