Apollo

Adding Dalle support to an open-source Discord bot. Includes fun with APIs, Docker, GitHub, and databases.

Apollo is a Discord bot born in the fire and flames that is the University of Warwick Computing Society Discord server. Created in late 2017, the open-source (https://github.com/UWCS/apollo) Python bot has been worked on by 25 people as of writing. It serves various functions such as: interfacing with the Warwick University API to provide timetabling functions; keeping track of the reputation of anything and everything; and, most recently, ChatGPT capabilities.

After we gained an OpenAI API key, one of our members wrote a command that interfaced with the API to give Apollo GPT capabilities. Since we already had the key, we thought: why not add Dalle support to Apollo? I decided to take on this task as I knew it would involve APIs, git, and Docker: tools vital to most production software, none of which I had used before. Thus my crusade began...

Initial Creation

I decided to approach this task by decomposing it into more manageable sub-tasks, working on the API, Docker, and git separately. To isolate the API problem, I set up my own Discord bot. According to OpenAI's documentation, the command to generate an image using Dalle is simple:

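import openai  # legacy (pre-1.0) openai-python interface

# Ask Dalle for a single 1024x1024 image matching the prompt
response = openai.Image.create(
    prompt="a white siamese cat",
    n=1,
    size="1024x1024"
)
# The response contains a list of results; take the URL of the first one
image_url = response['data'][0]['url']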

It is then a simple matter of downloading the image from the URL, uploading it to Discord, and sending a message with that image as the file. This can be further improved by using the undocumented openai.Image.acreate function, which is asynchronous. As the OpenAI API is infamously slow, awaiting the request instead of blocking on it keeps Apollo responsive for anyone else using it at the time. Now that I had the API sorted, it was time to actually integrate it into Apollo.
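To illustrate why the asynchronous call matters, here is a minimal sketch using only the standard library. Nothing in it is Apollo's code: slow_image_request and quick_command are hypothetical stand-ins, and asyncio.sleep takes the place of the real API request. The point is that while the slow request is awaited, the event loop is free to run another command.

```python
import asyncio


async def slow_image_request(prompt, log, delay=0.2):
    """Hypothetical stand-in for a slow, awaitable image request (no network)."""
    await asyncio.sleep(delay)  # hands control back to the event loop while waiting
    log.append("image")
    return f"image for: {prompt}"


async def quick_command(log):
    """Hypothetical stand-in for any other bot command run at the same time."""
    await asyncio.sleep(0.01)
    log.append("pong")
    return "pong"


async def main():
    log = []  # records the order in which the two commands finish
    # Both coroutines share one event loop; the slow one does not block the quick one.
    results = await asyncio.gather(
        slow_image_request("a white siamese cat", log),
        quick_command(log),
    )
    return results, log


if __name__ == "__main__":
    results, log = asyncio.run(main())
    print(results)
    print(log)
```

The quick command finishes first even though it was started second, which is exactly what a blocking call would prevent.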

Git (being a git)

Now that I had the core feature of the command sorted it was time to port it over to Apollo, and here my problems started...

Whilst the initial port was fine, apart from one issue where a subroutine was not being awaited properly, actually getting a valid commit was fun.

UWCS tries to (most of the time) write code that is readable and looks half decent. As part of this effort, there is a GitHub action that requires all code to pass both isort and black formatting checks. Because I'm dumb I was initially doing my development on Windows, where both the isort and black extensions in VSCode were broken. This resulted in me spending a couple of days manually implementing isort and black formatting.

Whilst black was relatively easy (apart from removing unnecessary whitespace), isort was a whole other game. The error messages from isort are profoundly unhelpful, merely saying "Imports are incorrectly sorted and/or formatted": very helpful. This was enough to prompt me to migrate all my development over to Linux and actually get isort and black working.

Once formatted, the code was merged (after a PR!), everyone liked it, and everyone lived happily ever after... right?

Pushing all the wrong buttons

After Dalle support was released, two main features needed to be implemented: a cooldown on the command, and the ability to regenerate an image or generate a variant of it.

The first was easy, but the second was not. The first was a simple matter of adding a cooldown to the command with a decorator (@commands.dynamic_cooldown()). The second was harder, as I wanted to do the regeneration via Discord's new buttons and retain image history.
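For anyone unfamiliar with the idea behind a dynamic cooldown, here is a rough plain-Python sketch of it. This is not discord.py's implementation and not Apollo's code: UserCooldowns, dalle_policy, EXEMPT_USERS, and the 60 second figure are all hypothetical. It only mirrors the general shape, where a policy function decides per user whether a cooldown applies (returning None means no cooldown).

```python
import time
from typing import Callable, Dict, Optional


class UserCooldowns:
    """Tracks when each user last ran a command (illustration only)."""

    def __init__(self, policy: Callable[[int], Optional[float]], clock=time.monotonic):
        self._policy = policy  # maps a user id to a cooldown in seconds, or None
        self._clock = clock    # injectable so the behaviour can be checked without waiting
        self._last_used: Dict[int, float] = {}

    def retry_after(self, user_id: int) -> float:
        """Return 0.0 if the user may run the command now (and record the use),
        otherwise the number of seconds they still have to wait."""
        per = self._policy(user_id)
        if per is None:
            return 0.0  # this user is exempt from the cooldown
        now = self._clock()
        last = self._last_used.get(user_id)
        if last is not None and now - last < per:
            return per - (now - last)
        self._last_used[user_id] = now
        return 0.0


EXEMPT_USERS = {1}  # hypothetical user ids that skip the cooldown


def dalle_policy(user_id: int) -> Optional[float]:
    """Hypothetical policy: exempt users have no cooldown, everyone else waits 60s."""
    return None if user_id in EXEMPT_USERS else 60.0


if __name__ == "__main__":
    cooldowns = UserCooldowns(dalle_policy)
    print(cooldowns.retry_after(2))  # first use is allowed
    print(cooldowns.retry_after(2))  # an immediate second use is refused
```

In the bot itself the decorator handles all of this bookkeeping, so the only thing left to write is the policy.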

The ability to generate a variant was easy, as there is an openai.Image.avariant function in the API which simply requires the URL of an image. As Discord hosts all images with a unique URL, I just needed to pass that URL into openai and boom: new image. To regenerate the image, I simply call the API again with the same prompt and let Dalle do its thing.

Now came the time to actually allow the user to select which image they wanted. Discord's new buttons are a bit of a pain to work with, as the documentation surrounding them was very limited and I had to go off pre-existing code.

The way you implement buttons is with a discord.ui.View class containing subroutines that each carry a discord.ui.button decorator. You initialise this class and then attach it to the message that you want the buttons to be applied to. You then pop whatever code you want to run when the button is pressed into the subroutine. I passed in the Dalle cog and made the acreate and avariant subroutines static methods so that I could call them from the view.

Battling with discord.py

Whilst adding the buttons was comparatively easy, having the audacity to want to maintain image history was painful.

The primary issue I faced was the embeds feature in discord.py being very weird. In the docs, each message has an attribute which retrieves all the attachments to that message. My solution was to get the list of attachments, use the last one (if required) to generate the image, then append the new image to the array, and finally send the message with the new attachments. This worked fine in theory, but in practice all appended attachments would be null:

[Image: null image]

This issue was made particularly infuriating as every attempt to fix the bug cost money. Admittedly it was a tiny amount ($0.016 per image), but having a literal cost to all of my failures was not fun. I would eventually mitigate this by hard-setting the images to be the Bing and Google logos and not calling the API.

Eventually, I would fix the attachment bug upon discovery of the add_files() method that would append a file to a message. Not only did this fix the bug (kinda, more on that later), but it also made the code much cleaner. WOO!

Sinking with docker and databases

After Dalle support was fully implemented, some people decided to generate some... fruity images with it, some even triggering Dalle's safety system. As each of these requests cost us money and could potentially cause our API key to be revoked, we decided to put a stop to it. To do this I decided to create a ban list for any OpenAI-powered command. This would require me to dabble in Apollo's database and the corresponding Docker container.
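The idea of a ban list can be sketched in a few lines. This is only an illustration under my own assumptions, not Apollo's implementation: BANNED_USER_IDS is a hypothetical in-memory set standing in for the database table, and openai_allowed and dalle are made-up names. The key point is that the check runs before any request is made, so a banned user never costs any money.

```python
import asyncio
import functools

# Hypothetical ban list; in the real bot this would live in the database.
BANNED_USER_IDS = {1234}


def openai_allowed(func):
    """Wrap an async command so banned users are refused before any API call."""

    @functools.wraps(func)
    async def wrapper(user_id, *args, **kwargs):
        if user_id in BANNED_USER_IDS:
            return "You are not allowed to use OpenAI-powered commands."
        return await func(user_id, *args, **kwargs)

    return wrapper


@openai_allowed
async def dalle(user_id, prompt):
    # A real command would call the image API here; this sketch only echoes the prompt.
    return f"generating: {prompt}"


if __name__ == "__main__":
    print(asyncio.run(dalle(1, "a white siamese cat")))
    print(asyncio.run(dalle(1234, "a white siamese cat")))
```

Writing the check as a decorator means the same ban list can be applied to every OpenAI-powered command without repeating the logic in each one.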

Apollo is dockerised into two containers: one for the code; one for the (many) databases. Thus all database commands need to be run inside the container. This led to many issues and many frustrating moments.

Apollo does come with docs for migrating databases. However, the commands given did not account for Apollo being run in a container. I would run the command to generate the migration and everything would appear fine, but upon pushing to GitHub I would be greeted with a 1.6k line error log. Not fun.

[Image: error log]

After a couple of hours of trying to fix this, I decided to just use the Docker container as a glorified Python interpreter and run the migration inside it. I then used docker exec apollo cp ... (yes that is the command Sammit) to extract the migration file from the container. This worked fine and managed to quash 1,600 lines of errors with 1 line of code (not too shabby).

The bugs that never die

Even after I spent multiple days on these features, there are still some very bizarre bugs whose cause no one has managed to work out.

Takeaways

Overall, whilst some parts were very frustrating, contributing to Apollo was really fun. I got to learn a lot about git, Docker, and APIs. I also got to work with some really cool people and learn a lot about how to write good code. This is also the first time I've written code that has actually been used by people, and seeing people enjoy something you have written is a really cool feeling. I would highly recommend contributing to open-source projects as it is a great way to learn and develop your skills. Finally, this is my first blog post, so if you have any feedback please let me know!

PS: Apollo's source code can be found at https://github.com/UWCS/apollo