COMPX241 Project Ideas

Play it again Sam, but this time with ...

Project Manager: Max Bracken

Team Members: William Malone; Maddy Martinez-Rodriguez; Henry Hitchins

Weekly Update Time: Wed 2-3pm

Key Idea: Develop a music playing app that makes subtle—or even perhaps, not so subtle!—changes to the music it plays, with the aim of letting you hear parts of the song mix that you might not normally hear.

There's not too much to add to this project description, as the key idea pretty much captures what the project is. In terms of how you might go about this, there are some simple audio filtering techniques that would allow you to make some subtle changes to what is played. This really only scratches the surface of the project, however. More advanced audio processing techniques have been developed that can adjust and adapt the music tracks in more interesting ways.

As a basis for the project, a key technique to consider is Music Source Separation, which takes existing music and seeks to separate out the different sound sources that are present in the audio. Once the sound sources are separated, you can apply more basic forms of audio filtering, such as adjusting the volume of the different tracks, or even amplifying and suppressing certain frequency ranges, before combining the audio sources back into a track for playback. Other examples of advanced audio manipulation techniques are Pitch Tracking and Pitch Shifting, which present further opportunities for how the audio might be manipulated in interesting ways.
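To make the "separate, adjust, recombine" step concrete, here is a minimal sketch of the recombining part only. It assumes the separation has already been done by some other tool and that each stem arrives as a list of samples in the range -1.0 to 1.0. The function name `remix`, the stem names and the sample values are all made up for illustration.

```python
from typing import Dict, List


def remix(stems: Dict[str, List[float]], gains: Dict[str, float]) -> List[float]:
    """Scale each separated stem by its gain, then sum them sample by sample."""
    length = min(len(samples) for samples in stems.values())
    mix = [0.0] * length
    for name, samples in stems.items():
        gain = gains.get(name, 1.0)  # stems without an explicit gain pass through unchanged
        for i in range(length):
            mix[i] += gain * samples[i]
    # Clamp to [-1.0, 1.0] so boosted stems stay inside the normal sample range
    return [max(-1.0, min(1.0, s)) for s in mix]


# Hypothetical stems (three samples each): pull the vocals back, push the bass forward
stems = {"vocals": [0.5, 0.25, -0.5], "bass": [0.25, 0.25, 0.25]}
bass_forward = remix(stems, {"vocals": 0.5, "bass": 2.0})
```

Hard clamping is the crudest way to deal with samples that go out of range. A real player would more likely rescale the whole mix, but that is a design decision for the project.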

Useful links:

The Infinite Jigsaw

Project Manager: Ralph Huey Olegario

Team Members: Alex Rhodes; Mahaki Leach; Rafeea Siddika

Weekly Update Time: Wed 2-3pm

Key Idea: You complete a jigsaw of a famous painting or photo, but then—drawing upon the capabilities of a Generative AI Art tool—your jigsaw app updates, and there is more jigsaw to complete!

I picture (no pun intended) this project as a web-based environment. Whether you make use of an existing open-source code-base for the jigsaw playing capability, or write one from scratch, is up to you. You should certainly spend time looking at some existing solutions, as at the very least it will give you some ideas on what to do, and how the interactions could work. One that caught my eye was CrowdJigsaw, which takes an interesting collaborative angle that I think is a good fit (again, no pun intended) with the general idea of why undertaking an infinitely extending jigsaw would be engaging.

Getting more into where the value add of this project lies, the technical challenge this project needs to solve is how to develop a programmatic way of generating and downloading the new regions to add into the jigsaw. Most of the Generative AI Art tools are only available as online interactive websites (e.g., DALL·E 2 and Midjourney). One way to programmatically engage with such sites is to look at adapting browser-based user interaction automation testing tools, such as Selenium. In the case of Stable Diffusion, note there is a publicly available version of the software, and so this offers a richer, more tightly integrated way to produce generative AI art, once you've learnt how to install and compile programs that use it.

Believing is Seeing

Project Manager: Colby Deed

Team Members: Edward Wilson; Aditya Shairna; Justin Poutoa

Weekly Update Time: Wed 2-3pm

Key Idea: Use Generative AI Art to produce a pair of pictures: one that historically could have occurred, and the other that could not have.

The new online game that's sweeping the world: you get shown two photos, both of them generated using Generative AI Art, however one of them is plausibly correct, and the other is not. A photo of Charles Darwin using an early form of telephone. A photo of Alexander Graham Bell reading a copy of Winnie-the-Pooh. But which is correct?

I like the idea for this game, and would definitely be interested to play it. But how do you go about producing a software environment that enables you to create such content for the game? This is the crux of the project. Just to be clear, though, having a running version of the game that people can play is part of it.

The direction I would take is to produce a visualisation tool that allows you to plot interesting artefacts on a timeline. These artefacts could be, for instance, well known historical figures, inventions, or events. In the case of historical figures, the timeline would chart when they were born, and when they died, using date information sourced from Wikipedia, say. Similarly inventions and events could be plotted on the timeline, although they would typically be a single point representing when they occurred. To produce a picture that could be potentially true, select two artefacts that overlap, and use that information to ask a Generative AI Art programme to generate a picture. To produce a false picture, select two artefacts that don't overlap.
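The overlap test at the heart of this is small enough to sketch. The following is one possible shape for it, not a prescribed design: the `Artefact` class and both function names are made up, and the years are typed in by hand for illustration where the real tool would source them from Wikipedia or similar.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Artefact:
    name: str
    start: int                 # year of birth, invention or event
    end: Optional[int] = None  # year of death; None for a single point on the timeline

    def span(self) -> Tuple[int, int]:
        return (self.start, self.end if self.end is not None else self.start)


def overlap_years(a: Artefact, b: Artefact) -> int:
    """Zero or more: years the two artefacts shared. Negative: size of the gap between them."""
    a_start, a_end = a.span()
    b_start, b_end = b.span()
    return min(a_end, b_end) - max(a_start, b_start)


def could_coexist(a: Artefact, b: Artefact) -> bool:
    return overlap_years(a, b) >= 0


# Hand-entered illustrative values for the two example photos mentioned earlier
darwin = Artefact("Charles Darwin", 1809, 1882)
telephone = Artefact("Telephone", 1876)
bell = Artefact("Alexander Graham Bell", 1847, 1922)
pooh = Artefact("Winnie-the-Pooh", 1926)
```

With these values, `could_coexist(darwin, telephone)` is true and `could_coexist(bell, pooh)` is false, with a gap of four years. The size of the overlap or gap is exactly the sort of number that could feed a difficulty setting.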

Some details to consider include: how far apart the two artefacts are (or the extent of their overlap), which can be linked to the level of difficulty in determining if a photo is plausible or not; and where a person was known to have travelled to, as it could be that even though they were alive at the right time, they never travelled to where the event (for instance) took place, meaning in reality it could not have happened after all.

Taking the information from the visualiser, some manual experimentation with a Generative AI Art program would then be undertaken to generate the images to use, followed by a simple mechanism to download them so they can be incorporated into the game. Say 5 rounds of two photos to a game with a high score table? (Note: I plucked the 5 pretty much from out of the air!). See The Infinite Jigsaw above for links to example Generative AI Art websites, if you're not already familiar with these sorts of sites.

In thinking of users playing the game, back in the photo creation phase you might like to vary the photos produced through rendering types such as photo (e.g., black and white for historic people) and painting (e.g., oil painting when going further back), or even vary the artist's painting style. When the user sees the photo, there is scope for a bit of game play development: do they see a text caption beneath the image straight from the start? Or perhaps for a small loss of points, they can ask for a hint, which reveals the text. Maybe even going one further and revealing the text that was entered to generate the photo (which could potentially give away a bit more information as to why the image composition was set the way it was).

In any event, when the user makes their guess—right or wrong—the program then reveals some text that explains which one is correct, and why, and why the other one is wrong.

Rather than resorting to page-scraping content from Wikipedia, a more machine-readable form of the content can be accessed through Linked Data representations of Wikipedia, via DBpedia and/or WikiData.
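As a taste of what the Linked Data route looks like, here is a hedged sketch that builds (but does not send) a SPARQL query for a person's birth and death dates on the WikiData query service. `P569` and `P570` are WikiData's date-of-birth and date-of-death properties. The function names are made up, and matching on an exact English label is a simplifying assumption that will miss or duplicate some people.

```python
from urllib.parse import urlencode

WIKIDATA_ENDPOINT = "https://query.wikidata.org/sparql"


def lifespan_query(name: str, language: str = "en") -> str:
    """Build a SPARQL query for the birth and death dates of items labelled `name`."""
    escaped = name.replace("\\", "\\\\").replace('"', '\\"')  # keep the string literal well-formed
    return (
        "SELECT ?person ?born ?died WHERE {\n"
        f'  ?person rdfs:label "{escaped}"@{language} ;\n'
        "          wdt:P569 ?born ;\n"  # P569 = date of birth
        "          wdt:P570 ?died .\n"  # P570 = date of death
        "}\n"
        "LIMIT 5"
    )


def request_url(query: str) -> str:
    """URL for a GET request to the query service, asking for JSON results."""
    return WIKIDATA_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


query = lifespan_query("Charles Darwin")
url = request_url(query)
```

Fetching `url` with any HTTP client (and a descriptive User-Agent header) should return the matches as JSON, which can then be fed into the timeline.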

Linked Data Resources:

An example-based introduction to linked data
VizQuery has a neat example of retrieving pictures of self-portraits by Van Gogh at the VG Museum in Amsterdam
DataViz demonstrates how to take the linked data retrieved and transform it into a visualisation.
See WikiData for a fuller list of visualisation tools.
Don't forget to look at the photos of cats sample query accessible through the base WikiData query page.
Linked Jazz

My em/ai/l

Project Manager: Samuel Tritscher

Team Members: Emma Campbell; Tanner Rowe; Kyle Doms

Weekly Update Time: Wed 1-2pm

[previously: My Point of View (POV) Email Server]

Key Idea: Develop a software capability that screens in-coming email messages for you, seeking to automatically address common pitfalls that you encounter: a student writing to you but not mentioning which of your courses they are in; an email message that says there is an attached document, but no attached files are present.

The project idea was originally conceived as closely tied to an email server. A broader scope for the project would be to look at using something like Power Automate to introduce more general text-based AI capabilities. The idea is rather on trend. See for example Google's recent announcement about incorporating AI features into its Google Workspace.

Keywords: Natural Language Processing (NLP); Web Email API

To push the idea a bit further, the idea of this project is to develop an environment that allows a user of myPoV-ES (needs a better name!) to develop their own set of rules and actions that occur. I use GMail, and would like an area in the interface where I can express rules that give me the sort of behaviour above. But what sorts of other monitoring and automated replies would be possible? Is it possible to parameterise aspects of the rules? After all, for my example, there might be a few things that I'd like clarified: not only the course, but which assignment or lab exercise they are talking about. I wouldn't want to have to generate separate rules and messages for each variant.
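One way to picture a parameterised rule is as a small record holding the thing you want clarified plus a pattern that shows it has been mentioned. The sketch below is only an assumption about how such rules might look. The class name, the patterns and the wording of the reply are all invented, and a real rule language would need to be something the user can write in the interface.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MissingDetailRule:
    detail: str   # what the sender should have mentioned, e.g. "course"
    pattern: str  # regular expression that counts as mentioning it

    def is_satisfied(self, body: str) -> bool:
        return re.search(self.pattern, body, re.IGNORECASE) is not None

    def question(self) -> str:
        return f"Could you let me know which {self.detail} your message is about?"


def clarification_reply(body: str, rules: List[MissingDetailRule]) -> Optional[str]:
    """Return the text of an automatic reply, or None when nothing is missing."""
    questions = [rule.question() for rule in rules if not rule.is_satisfied(body)]
    return "\n".join(questions) if questions else None


# One rule shape, two parameterisations: no separate rule and message per variant
rules = [
    MissingDetailRule("course", r"\bCOMPX\d{3}\b"),
    MissingDetailRule("assignment or lab exercise", r"\b(assignment|lab)\s*\d+\b"),
]
```

A message such as "I'm stuck on assignment 2 in COMPX241" triggers no reply, whereas "I am stuck, please help" gets both questions back.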

With a bit of planning, it should be possible to implement a solution that is agnostic about the email client. If the solution developed took the form of a bi-directional mail server proxy, then this opens up the possibility of inserting it between your email client and the email server. As far as the email server is concerned it's talking to an email client (but it's not, it's your proxy). Likewise the email client you use: it thinks it's talking to the email server, but it is in fact your proxy again. In the in-coming direction, the proxy monitors things, and overall doesn't change things much. But when something comes in that triggers one of its rules, the proxy acts as an email client and issues an automatic email reply (it then doesn't need to do anything else, as presumably the student replies to that message including the additional requested details). In the out-going direction, when the user writes a message and hits send, it goes to the proxy. Nothing complicated here (except maybe to monitor for things you might have forgotten to do, like include the attachment!): the proxy just sends the message straight on to the email server.

While checking for attachments is a feature that appears in several email clients (and so not the most original idea for myPoV-ES), with the rule-based aspect to the project, you could certainly customise what it is checking for. It's not that much of a stretch to imagine the situation where you have a particular way of phrasing this that isn't picked up by the email client. In this vein, you could also include out-going rules that:

Spot and correct typos and spelling mistakes that slip through: the the, for example;
Fix spurious capitalization, such as when you sign off with your name DAvid for the umpteenth time;
Or else provide other forms of assistance, such as changing abbreviations into their fully expanded form, or how about Uni course codes changed so the title of the course is also included?
A biggie in this regard is spotting an out-going message that, in a heated (or otherwise incapacitated!) state of mind you are likely to regret sending later on. Applying the technique of Sentiment Analysis to text is one way to approach this.
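A few of these out-going checks can be roughed out with regular expressions. The sketch below makes no claim to be robust: the function names are made up, the patterns are deliberately naive, and Sentiment Analysis is left out entirely since it needs a proper library or model.

```python
import re
from typing import List


def repeated_words(text: str) -> List[str]:
    """Find doubled words such as 'the the'."""
    return [m.group(1) for m in re.finditer(r"\b(\w+)\s+\1\b", text, re.IGNORECASE)]


def fix_double_capitals(text: str) -> str:
    """Turn 'DAvid' into 'David': two leading capitals followed by lower case.

    Naive on purpose: it would also 'fix' something like 'IDs'.
    """
    return re.sub(
        r"\b([A-Z])([A-Z])([a-z]+)\b",
        lambda m: m.group(1) + m.group(2).lower() + m.group(3),
        text,
    )


def missing_attachment(body: str, attachment_count: int) -> bool:
    """True when the text talks about an attachment but no files are attached."""
    mentions = re.search(r"\battach(ed|ment|ments)?\b", body, re.IGNORECASE)
    return attachment_count == 0 and mentions is not None


draft = "Please see the the attached report.\n\nRegards, DAvid"
```

Here `repeated_words(draft)` reports the doubled "the", `fix_double_capitals(draft)` repairs the sign-off, and `missing_attachment(draft, 0)` flags that the file was forgotten.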

Useful Links:

If-This-Then-That (IFTTT) by way of inspiration
The Java Apache Mail Enterprise Server (JAMES). [They have definitely done a better job on the naming front than myPoV-ES!!] Note: this is just one example of an Open Source email server. I've chosen to highlight it as it includes Mailet containers, a mechanism for incorporating "independent, extensible and pluggable email processing agents".
is an example JS library for customising GMail
Alternatively, take a look at Open Source email clients (just an example listing), as a way of providing a coding base from which you can embed your rule-based interface.
Here's another article discussing an alternative to gmail. Note the description of Zimbra which provides both an email server and web-client.
There might be some mileage in officially provided extension mechanisms such as Gmail Add-ons (some assessment needed).
Alternatively, take a roll-your-own approach and glom on to an existing commercial API, such as GMail, through a JS library such as gmail.js

The World According to Me!

Project Manager: Tony Nugroho

Team Members: Alma Walmsley; Nathan Suakes; Daniel Jensen; Agnes Andersen

Weekly Update Time: Wed 1-2pm

Key Idea: Develop an extension to a web browser that lets you seamlessly edit any web page you are on, storing the changed version locally. When you visit the same page again, your edited version is the one that is displayed.

The project is a revitalisation of a project called Seamless Webpage Editing, or Seaweed for short. With Seaweed installed in your browser (through a GreaseMonkey extension), I could visit any web page, and edit it if I wanted to. I'll just repeat that: I could visit any web page and edit it. We added in some basic back-end storage capability so edited changes could be saved and restored when you visited the web page again, but otherwise left the work as a proof-of-concept research project.

Since our work, web technologies have moved on, and there are now more robust ways to go about providing the core functionality we developed. In particular, CKEditor has an inline editing mode which delivers the crucial ability to edit a web page without reloading it to activate its editing ability ... where what is meant by "editing" is a much more richly developed capability than Seaweed had. (Note: when I previously checked, CKEditor4 had this ability, and while CKEditor5 was out, it was only newly out, and from what I could see didn't yet have the inline capability. It could have since been added, or alternatively, the 'no need to reload the page' editing capability was already there in CKEditor5, but under a new name.)

My vision for this project is to create a web browser environment where I can perform my regular browser activities, and edit the page any time I see content that I want to change, be it to:

highlight part of the page to help draw attention to what I found useful, should I return to the page at a later date;
delete sections of the page that are just distracting, extraneous blocks that obscure what I want to see;
fix an annoying typo that forever lives on and the content creator of the page isn't going to change; or
add an annotation to a region of the page, so I can store details in notes that I would like to be related to the page.

With suitable crafting of the software architecture for this project it should be possible to create 'namespaces' in which all the edits/annotations are stored, and which can be shared with other users. The editing environment would then let the user switch between them. Let's say there is davidb as my personal namespace, but I also created one called compx203 for Computer Systems. The latter is shared with students in COMPX203, and when those users access the web, they will encounter content with highlighting, notes, and other editing to help bring out the relevance to the course. I would even push the idea further and say that a namespace could specify whether editing is restricted to a list of users, or whether anyone accessing the namespace can edit.

A challenging problem for this world of editing is websites with dynamic content that changes regularly. That covers pretty much the home page of every interesting high-volume website out there! Since undertaking the original Seaweed project, I have thought of a technique based on hashing that can be leveraged to allow the idea to work on these types of site too.

The basic rough-cut idea runs as follows:

There is a relatively simple Javascript program that can be written that recursively traverses the Document Object Model (DOM) that is formed when a web page is loaded into a browser.
The first time the World According to Me! (WAM!) encounters a page, it runs this traversal algorithm, and for each node in the DOM it computes a hash value based on that node's innerHTML, which it stores as a data attribute in that node.
WAM! then saves this modified page to the backing store infrastructure it provides.
The user is then free to edit, update, delete, annotate the page as they see fit. Any edits are saved to the WAM's backing store infrastructure.
Later on, if the user visits the (dynamically changing content) website again, WAM! goes through its hash-computing DOM traversal again. For these newly computed hash-values, if they match a value saved in the WAM! backing store, then we have established a point of correspondence, even if where it is on the page is no longer the same (e.g., it has moved further down the page). For any of these points of correspondence, if WAM! has an edited version stored against that hash-value, then it should take its version, and use it to replace that part of the newly loaded page.
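The steps above can be sketched outside the browser to check the logic hangs together. The following uses Python and a toy `Node` tree standing in for the DOM, so every name in it is an assumption. In the real extension this would be Javascript walking actual DOM nodes, and the edits would live in the backing store rather than a dictionary.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    tag: str
    text: str = ""
    children: List["Node"] = field(default_factory=list)


def inner_html(node: Node) -> str:
    return node.text + "".join(outer_html(child) for child in node.children)


def outer_html(node: Node) -> str:
    return f"<{node.tag}>{inner_html(node)}</{node.tag}>"


def node_hash(node: Node) -> str:
    """Hash of the node's inner content, mirroring 'a hash value based on innerHTML'."""
    return hashlib.sha256(inner_html(node).encode("utf-8")).hexdigest()


def apply_edits(node: Node, edits: Dict[str, Node]) -> Node:
    """Copy the tree, swapping in the saved version wherever a node's hash matches."""
    saved = edits.get(node_hash(node))
    if saved is not None:
        return saved  # point of correspondence: use the user's edited version
    return Node(node.tag, node.text, [apply_edits(child, edits) for child in node.children])


# First visit: the user fixes a typo, and the edit is filed under the ORIGINAL hash
typo = Node("p", "Teh quick fox")
edits = {node_hash(typo): Node("p", "The quick fox")}

# Later visit: a new story has pushed the old paragraph further down the page
later = Node("body", children=[Node("p", "Breaking story"), Node("h1", "News"), Node("p", "Teh quick fox")])
restored = apply_edits(later, edits)
```

After this, `outer_html(restored)` contains the corrected paragraph in its new position. Note that hashing only the inner content means two different elements with identical content share a hash, which is one of the edge cases to think about.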

That is not to say that developing such an algorithm will be straightforward. It will be challenging for sure, and there will be edge cases that the above has not considered. However, it does lay out a basic approach for how to track content whose position on a page moves over time.

A Fountain of Information

Project Manager: Joel Shepherd

Team Members: Sabina Han; Cymone Jacob; Hans Lomboy

Weekly Update Time: Wed 1-2pm

Key Idea: A virtual 3D fountain where the jets of water displayed are a representation of some interesting information that changes over time.

The Data Fountain is a thought-provoking project where a team has built a physical fountain with three jets of water, where each jet is respectively linked—in realtime—to the exchange rates of the US dollar, the Euro and the Yen. The height of a jet is proportional to the currency exchange rate that it represents, with its height updated every 5 seconds.
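The mapping from data to jet height is the simplest piece to pin down. Below is a sketch of one possible mapping, where each jet has its own "full scale" value so that very differently sized numbers can share a fountain. The function names, the 2.0 maximum height and all of the readings are invented for illustration; they are not real exchange rates.

```python
from typing import Dict, List


def jet_height(value: float, full_scale: float, max_height: float = 2.0) -> float:
    """Height proportional to the value, capped at max_height and never below zero."""
    fraction = max(0.0, min(1.0, value / full_scale))
    return fraction * max_height


def fountain_frames(
    series: Dict[str, List[float]],
    full_scale: Dict[str, float],
    max_height: float = 2.0,
) -> List[Dict[str, float]]:
    """One dictionary of jet heights per time step, for a renderer to play through."""
    steps = min(len(values) for values in series.values())
    return [
        {name: jet_height(values[i], full_scale[name], max_height) for name, values in series.items()}
        for i in range(steps)
    ]


# Made-up readings for two jets, one per update (e.g. every 5 seconds)
series = {"USD": [0.5, 0.75], "JPY": [100.0, 150.0]}
frames = fountain_frames(series, {"USD": 1.0, "JPY": 200.0})
```

Because the frames are just a list, the same code works whether the readings arrive live or are replayed from a static data-set.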

This project is about taking the idea of the Data Fountain, and moving it into the digital realm. Without the same physical constraints, there are a lot more possibilities to explore: number of jets, the type of jets, how the water flows, the colour of the water, a range of different sounds could even be added in.

As to the source of information, this too is highly configurable in a digital version. A trivial extension would be to expand upon the three currencies of the Data Fountain: crypto-currencies would be an obvious choice here given how they have hit the news; the share price of companies is another variation of this in the same vein.

But there are so many other sources of information. Put into the context of a Digital Data Fountain display in university, what sorts of source data would be interesting to map to how the fountain display changes: the volume and different types of data passing through the university network? A fountain display that captures how many lectures are presently going on, with aspects to the display that indicate at what year of study, and what subject?

Looking to bring in inspiration from further afield, check out the Reddit Data is Beautiful thread. Thinking about many of these example sources of data brings out the idea that the source for the fountain display doesn't necessarily have to be real-time. A static data-set could equally be used, as long as there is a meaningful progression through the data. For instance, a fountain display based on Tracked my student loan from beginning to end could start out looking like a regular fountain, which then becomes more distorted as the debt mounts, and then gradually returns to its original form as the student loan is paid off. An additional data feature to represent is when they graduated, and so there could be the start of something new and different in the fountain display that grows as the debt is paid off, presumably through the employment they now have as a result of their degree.

So the sort of capability this software project is looking for is a 3D rendering environment where jets/waterflow can be displayed, with the viewpoint looking at the fountain changed over time, ideally with sound effects also being played. This rendering capability can be connected to a variety of different data sources, with some configurations being set, and then the fountain is "switched on".

There are many 3D graphics environments to choose from. Following my adage that you should always look to make a new project web-based unless there is a sound technical reason why it cannot be, a good starting point would be assessing the capabilities of WebGL.

And as a final comment on the trajectory this project takes, the visualisation the project develops doesn't even really need to be a fountain! A fountain captures the origins of the project idea, and certainly has pleasing connotations with a physical form that people are inclined to gravitate towards, and stand and watch for a while. But maybe you can think of something else that is equally (or more!) engaging?