COMPX241 Emerging Project Ideas

Grey Power, Trip

Key Idea: A map-based travel planner that caters to the needs of the elderly

  • Keywords: Map API(s); Human Computer Interaction; Route Planning; Digital Content Management;

When it comes to travelling, taking a trip—whether it be a day excursion or something longer—presents a raft of logistical issues that you don't typically face in day-to-day life. In my family we have a saying, “definitely out of our comfort-zone,” used at times to help identify when tensions might be rising due to the situation we are in. My family are a pretty capable bunch, but the phrase turns up more than a few times when we're travelling!

As you get older, those challenges typically increase in both type and frequency, and this is where the focus of this project lies: develop a travel planning and on-the-road app that is tailored to the needs of the elderly, variously referred to as the silver-surfer generation, or grey power.

There are various factors to think about in developing such an app. For instance, what is the level of mobility of the person or people making the trip? If one of the passengers uses a wheelchair, then it would be helpful if the travel app highlighted where disabled parking spaces are. Or how about eyesight? I don't recall Google Maps having a feature that lets users enlarge the landmarks themselves to make them easier to see, for example. Worth double-checking for sure, but Grey Power, Trip could well be the first travel app to include such a feature.

Once you start thinking about the issue of designing a travel app that caters to the elderly, other ideas will come to mind. Users in the target demographic are typically retired, and so should find it easier (and, given the increased complexity of making a trip, be more motivated) to spend time planning things out in advance of the trip. For such planning work it would make sense to use a desktop computer or laptop, with the outcome of the planning ideally transferred to a phone or tablet for the trip itself.

For Later, Dude (Let's talk about this at another time)

Key Idea: Make it as easy as possible to record yourself, wherever you are, speaking (ideas, instructions, etc.) that will later on be interpreted with an online Siri-type application.

  • Keywords: Automatic Speech Recognition (ASR); Deep Learning; Knowledge Representation

Imagine the scenario: I'm driving to work when a song comes on the radio. After a few bars into the song I remember that this is a song I've been meaning to find out the name of—trouble is the DJ has already said upfront what the song was and I wasn't listening that closely. In theory I could get out my phone, enter its PIN, scroll through the apps to find Spotify, and then get it to sample the song that is playing. That's enough of a pain to do if you're a passenger in the car, and with all that faffing around the song might have very well finished in the meantime ... but remember I'm the driver! Yikes!!

Think how more convenient it would be if I had, say, a pen clipped into my shirt pocket that's also a dictaphone: I could slide it out, click the record button and say For Later Dude, find out what song this is and then let the pen/dictaphone record some of the audio that's playing. Subsequently, when I'm at my desk at work, I plug in the dictaphone pen (it's also a USB thumbdrive, don't ya know!), and the Later Dude software I have installed on my PC scans the device for new content, which it then applies Speech-to-Text software to, and actions the result.

I now know the name of that song playing on the radio ... have changed the time of a scheduled meeting in my calendar ... and added into my Project Ideas doc for COMPX241 this new nifty idea I had, while making breakfast, about making it easy to record audio wherever you happen to be, and then later on have it processed!

As an additional thought, to round out the idea: given that the recording device acts as a USB thumbdrive, the "For Later, Dude" code that you write for processing the audio can be stored on that same USB drive. With a bit of care, you should even be able to set things up so the processing software can run on whatever host machine it is plugged into, be it Windows, MacOS or Linux. Maybe you could even get Android in your sights, as a phone or tablet that supports OTG (On-The-Go), which is fairly common these days, will allow a USB device to be plugged in and appear as a disk.
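The "scans the device for new content" step can be prototyped in a few lines. Here is a minimal Python sketch, under two assumptions of mine: recordings land on the drive as .wav files, and a small JSON manifest file (a name invented for this example) stored on the drive itself tracks what has already been processed, so re-plugging the pen only surfaces new recordings.

```python
import json
from pathlib import Path

def find_new_recordings(mount_point, manifest_path):
    """Return audio files on the device that haven't been processed yet.

    A JSON manifest (stored on the device itself) records which files
    have already been handled, so each plug-in only surfaces new ones.
    """
    mount = Path(mount_point)
    manifest = Path(manifest_path)
    seen = set(json.loads(manifest.read_text())) if manifest.exists() else set()

    new_files = [p for p in sorted(mount.glob("*.wav")) if p.name not in seen]

    # Record the new files as processed, ready for the next scan.
    manifest.write_text(json.dumps(sorted(seen | {p.name for p in new_files})))
    return new_files
```

Each new file found would then be handed to the Speech-to-Text stage.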

Potential frameworks, tools and API to build upon:

  • Mozilla's Deep Speech for Speech-to-Text recognition
  • The Open Source voice assistant platform: Mycroft
  • If working with Mycroft, then its skills plugin component looks like a promising way to approach supporting niche tasks such as my "what is this song that is playing" example.

Play it again Sam, but this time with ...

Key Idea: Develop a music playing app that makes subtle—or even perhaps, not so subtle!—changes to the music it plays, with the aim of letting you hear parts of the song mix that you might not normally hear.

There's not too much to add to this project description, as the key idea pretty much captures what the project is. In terms of how you might go about it, there are some simple audio filtering techniques that would allow you to make subtle changes to what is played. These really only scratch the surface of the project, however: more advanced audio processing techniques exist that can adjust and adapt music tracks in more interesting ways.

As a basis for the project, a key technique to consider is Music Source Separation, which takes existing music and seeks to separate out the different sound sources present in the audio. Once the sound sources are separated, you can apply more basic forms of audio filtering, such as adjusting the volume of the individual tracks, or amplifying and suppressing certain frequency ranges, before combining the audio sources back into a track for playback. Other examples of advanced audio manipulation techniques are Pitch Tracking and Pitch Shifting, which present further opportunities for manipulating the audio in interesting ways.
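To give a concrete flavour of the simpler end of that filtering spectrum, here is a minimal numpy sketch that boosts or suppresses one frequency band of a mono signal. A real implementation would use proper filter design (and operate per separated source), but the underlying idea is the same; the function name and parameters are my own.

```python
import numpy as np

def adjust_band(samples, sample_rate, low_hz, high_hz, gain):
    """Scale the energy in one frequency band of a mono signal.

    Take the FFT, multiply the bins falling inside [low_hz, high_hz]
    by `gain`, and transform back to the time domain.
    """
    spectrum = np.fft.rfft(samples)
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
    band = (freqs >= low_hz) & (freqs <= high_hz)
    spectrum[band] *= gain
    return np.fft.irfft(spectrum, n=len(samples))
```

With gain below 1 a band is suppressed; above 1 it is amplified; applied to a separated vocal or drum track, this is exactly the "adjust the mix" capability described above, in rough form.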

Waiata For All

Key Idea: At a pōwhiri when a waiata starts, your phone discreetly brings up the song that is being sung, with music notation and lyrics displayed to help you join in.

  • Keywords: Digital Content Management; Audio Signal Processing

To hit the mark on this project, there are a few steps that need to be implemented as a precursor to achieving the end result. Implicit in the above description is some sort of repository of waiata to match against. Where did that come from? The answer: your project! How were those songs entered in? The answer: your project!

So, to start a bit further up the chain, as part of this project you will develop an environment that allows users to enter the musical notes to a waiata (i.e., in a symbolic format). Unlike the use-case scenario expressed through the key idea, the composing step doesn't have to be done using a phone—although it's a good idea to keep this option in the mix if you can.

This touches on a mantra I have formed as the result of many years of research-led software development: when starting a new project, design it as a web app unless there is a technical impediment that prevents you from doing so. And it is increasingly easy to do these days. In particular, there are web-app development stacks readily available, such as React Native and Meteor, that rival the capabilities of bespoke desktop applications while building only on web technologies: essentially HTML, CSS and Javascript. Because of the approach they take, when you press the deploy button it is just as easy for these development environments to produce an app that runs on a desktop computer as one that operates on a mobile device; and a web-site version, where nothing special needs to be installed by the user and the app can be accessed from wherever they are, comes for free.

To get into a bit more detail about entering waiata: the idea would be to design an interface that lets someone without formal musical training do this. But I recommend you start with more modest goals in the first instance. As a foundation, decide on a simple text format and let people enter that directly. To this you could then add an on-screen musical keyboard, so that anyone who can play piano (whether formally trained or not) can enter a song more conveniently. They might not play all the right notes, so being able to adjust what has been entered is also a feature to support. A step after that would be to let the user sing the song as yet another way to get it entered. In that case an error turning up in the symbolic form of the song is just as likely to be an error in the transcription process, but one that can be resolved the same way, through the editing feature.
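As an illustration of the "simple text format" starting point, here is one hypothetical format and a Python sketch that parses it into (MIDI pitch, duration-in-beats) pairs. The token syntax is entirely made up for this example; any comparably simple scheme would do.

```python
# Hypothetical minimal melody format: whitespace-separated tokens of the
# form <note><octave>:<duration>, e.g. "E4:q" is a quarter-note E above
# middle C. Sharps/flats are written C#4 or Bb3.

NOTE_OFFSETS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}
DURATIONS = {"w": 4.0, "h": 2.0, "q": 1.0, "e": 0.5}  # whole .. eighth

def parse_melody(text):
    """Parse the toy format into a list of (MIDI pitch, beats) pairs."""
    notes = []
    for token in text.split():
        pitch_part, dur_part = token.split(":")
        name = pitch_part[0].upper()
        # An accidental, if present, sits between note name and octave.
        accidental = 1 if "#" in pitch_part else (-1 if "b" in pitch_part[1:] else 0)
        octave = int(pitch_part[-1])
        midi = 12 * (octave + 1) + NOTE_OFFSETS[name] + accidental
        notes.append((midi, DURATIONS[dur_part]))
    return notes
```

The symbolic (pitch, duration) list is the form the rest of the system, including the matching step at the pōwhiri, would work with; the on-screen keyboard and sung-input paths would simply produce the same structure by other means.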

Libraries and algorithms for all these steps already exist. Just not likely to be in exactly the form and shape you need them to be in!

You'll also want the user to enter some metadata about the song: what its title is, which iwi (if any) it is associated with, and so on. Traditional Knowledge (TK) Labels, an initiative similar in spirit to Creative Commons for expressing rights around indigenous content, are likely to have a useful role here.

The Infinite Jigsaw

Key Idea: You complete a jigsaw of a famous painting or photo, but then—drawing upon the capabilities of a Generative Art tool—your jigsaw app updates, and there is more jigsaw to complete!

I picture (no pun intended) this project as a web-based environment. Whether you make use of an existing open-source code-base for the jigsaw playing capability, or write one from scratch, is up to you. You should certainly spend time looking at some existing solutions, as at the very least it will give you ideas on what to do and how the interactions could work. One that caught my eye was CrowdJigsaw, which takes an interesting collaborative angle that I think is a good fit (again, no pun intended) with the general idea of why undertaking an infinitely extending jigsaw would be engaging.

Getting more into where the value-add of this project lies: the technical challenge this project needs to solve is how to develop a programmatic way of generating and downloading the new regions to add into the jigsaw. Most AI Generative Art tools are only available as online interactive websites. One way to engage with such sites programmatically is to look at adapting browser-based user-interaction automation testing tools, such as Selenium. In the case of Stable Diffusion, note there is a publicly available version of the software, and so this offers a richer, more tightly integrated way to produce generative art, once you've learnt how to install and compile programs that use it.

Believing is Seeing

Key Idea: Use Generative AI Art to produce a pair of pictures: one that historically could have occurred, and the other that could not have

The new online game that's sweeping the world: you get shown two photos, both generated using Generative Art; one of them, however, is plausibly correct, and the other is not. A photo of Charles Darwin using an early form of telephone. A photo of Alexander Graham Bell reading a copy of Winnie-the-Pooh. But which is correct?

I like the idea for this game, and would definitely be interested to play it. But how do you go about producing a software environment that enables you to create such content? That is the crux of this project. Just to be clear, though: having a running version of the game that people can play is also part of it.

The direction I would take is to produce a visualisation tool that allows you to plot interesting artefacts on a timeline. These artefacts could be, for instance, well known historical figures, inventions, or events. In the case of historical figures, the timeline would chart when they were born, and when they died, using date information sourced from Wikipedia, say. Similarly inventions and events could be plotted on the timeline, although they would typically be a single point representing when they occurred. To produce a picture that could be potentially true, select two artefacts that overlap, and use that information to ask a Generative AI Art programme to generate a picture. To produce a false picture, select two artefacts that don't overlap.
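The overlap test at the heart of this is simple interval arithmetic. A sketch, treating each artefact as a (start_year, end_year) interval, with a single-year event written as (year, year):

```python
def lifespans_overlap(a, b):
    """Report whether two (start_year, end_year) intervals overlap.

    Returns (True, years_of_overlap) for overlapping intervals, which
    are candidates for a 'plausible' image, or (False, years_of_gap)
    for disjoint ones, candidates for an 'impossible' image. The size
    of the overlap or gap can serve as a difficulty knob.
    """
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    if start <= end:
        return True, end - start
    return False, start - end
```

For example, Darwin (1809-1882) does overlap the 1876 telephone, while Bell (1847-1922) misses Winnie-the-Pooh (1926) by four years.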

Some details to consider include: how far apart the artefacts are (or the extent of their overlap), which can be linked to the level of difficulty in determining whether a photo is plausible or not; and factoring in where a person was known to have travelled, since even if they were alive at the right time, never having travelled to where (for instance) an event took place means it could not in reality have happened after all.

Taking the information from the visualiser, some manual experimentation with an AI Generative Art program would then be undertaken to generate the image to use, followed by a simple mechanism to download it so it can be incorporated into the game. Say 5 rounds of two photos to a game, with a high-score table? (Note: I plucked the 5 pretty much out of thin air!)

In thinking of users playing the game: back in the photo creation phase you might like to vary the pictures produced through rendering types such as photo (e.g., black and white for historic people) and painting (e.g., oil painting when going further back), or even vary the artist's painting style. When the user sees the photo, there is scope for a bit of game-play development: do they see a text caption beneath the image right from the start? Or perhaps, for a small loss of points, they can ask for a hint, which reveals the text. Maybe even go one step further and reveal the text that was entered to generate the photo (which could potentially give away a bit more information as to why the image composition was set up the way it was).

In any event, when the user makes their guess—right or wrong—the program then reveals some text that explains which one is correct, and why, and why the other one is wrong.

Rather than resorting to page-scraping content from Wikipedia, a more machine-readable form of the content can be accessed through Linked Data representations of Wikipedia, such as DBpedia and/or WikiData.
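By way of example, here is the kind of SPARQL query one might send to the WikiData query endpoint; P569 and P570 are WikiData's date-of-birth and date-of-death properties. The Python below only builds the query text; actually issuing it over HTTP is left to the caller, and the function name is my own.

```python
def lifespan_query(person_label, lang="en"):
    """Build a SPARQL query for the WikiData endpoint that looks up a
    person's birth (P569) and death (P570) dates by label.

    The wdt: and rdfs: prefixes are predefined on the WikiData query
    service, so no PREFIX declarations are needed there.
    """
    return f"""
    SELECT ?person ?birth ?death WHERE {{
      ?person rdfs:label "{person_label}"@{lang} ;
              wdt:P569 ?birth .
      OPTIONAL {{ ?person wdt:P570 ?death . }}
    }} LIMIT 5
    """
```

The dates returned would feed directly into the timeline visualiser described above.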


Learning Your Lines, Expeditely!

Key Idea: Use the Open Source Spatial Hypermedia System Expeditee to create a bespoke digital environment to support an actor learning their lines.

Expeditee is an open source spatial hypermedia system (developed here at Waikato under the leadership of Rob Akscyn) quite unlike any other information system you are likely to have encountered! It can be a word processor, a mind-mapping tool, a graphics visualisation system, and many other things besides! We've been experimenting with it as an environment in which to compose and create new music. The task proposed for this Smoke and Mirrors project is to look at how it might be used to help an actor learn their lines and rehearse for a drama production.

The sorts of features that could be developed are:

  • Displaying the lines you need to learn, which you then over time condense down to be briefer and briefer
  • Record yourself saying your lines, so you can then play it back and listen to how it came over
  • Record your colleagues' lines (ideally get them to record their own), which you then incorporate into your Expeditee frames, leaving gaps, so you can practise saying your lines and then hear theirs said.
  • Video yourself saying your lines, so you can review your facial expressions

My em/ai/l

[previously: My Point of View (POV) Email Server]

Key Idea: Develop a software capability that screens in-coming email messages for you, seeking to automatically address common pitfalls that you encounter: a student writing to you but not mentioning which of your courses they are in; an email message that says there is an attached document, but no attached files are present.

The project idea was originally conceived as being closely tied to an email server. A broader scope for the project would be to look at using something like Power Automate to introduce more general text-based AI capabilities. The idea is rather on trend: see, for example, Google's recent announcement about incorporating AI features into its Google Workspace.

  • Keywords: Natural Language Parsing (NLP); Web Email API

To push the idea a bit further: the aim of this project is to develop an environment that allows a user of myPoV-ES (needs a better name!) to develop their own set of rules and the actions they trigger. I use GMail, and would like an area in the interface where I can express rules that give me the sort of behaviour described above. But what sorts of other monitoring and automated replies would be possible? Is it possible to parameterise aspects of the rules? After all, in my example there might be a few things that I'd like clarified: not only the course, but which assignment or lab exercise the student is talking about. I wouldn't want to have to generate separate rules and messages for each variant.

With a bit of planning, it should be possible to implement a solution that is agnostic about the email client. If the solution developed took the form of a bi-directional mail-server proxy, then this opens up the possibility of inserting it between your email client and the email server. As far as the email server is concerned, it's talking to an email client (but it's not, it's your proxy). Likewise the email client you use: it thinks it's talking to the email server, but it is in fact your proxy again. In the in-coming direction, the proxy monitors things, and overall doesn't change much. But when something comes in that triggers one of its rules, the proxy acts as an email client and issues an automatic reply (it then doesn't need to do anything else, as presumably the student replies to that message including the additional requested details). In the out-going direction, when the user writes a message and hits send, it goes to the proxy. Nothing complicated here (except maybe monitoring for things you might have forgotten to do, like include the attachment!): the message is sent straight on to the email server.
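To make one of the proxy's rules concrete, here is a sketch using Python's standard email module of the "mentions an attachment but has none" check; the trigger phrases are just ones I've picked for illustration.

```python
import re
from email.message import EmailMessage

# Hypothetical trigger phrases; a real rule set would be user-editable.
ATTACH_PHRASES = re.compile(r"\b(attached|attachment|see the enclosed)\b", re.I)

def missing_attachment(msg: EmailMessage) -> bool:
    """One outgoing rule: True if the body mentions an attachment but
    the message carries none, so the proxy should hold it and warn.
    """
    body = msg.get_body(preferencelist=("plain",))
    text = body.get_content() if body else ""
    mentions = bool(ATTACH_PHRASES.search(text))
    has_attachment = any(part.get_filename() for part in msg.iter_attachments())
    return mentions and not has_attachment
```

In the proxy, this check would run just before forwarding an outgoing message to the real server, bouncing it back to the user when it fires.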

While checking for attachments is a feature that appears in several email clients (and so not the most original idea for myPoV-ES), with the rule-based aspect to the project, you could certainly customise what it is checking for. It's not that much of a stretch to imagine the situation where you have a particular way of phrasing this that isn't picked up by the email client. In this vein, you could also include out-going rules that:

  • Spot and correct typos and spelling mistakes that slip through: the the, for example;
  • Fix spurious capitalization, such as when you sign off with your name DAvid for the umpteenth time;
  • Or else provide other forms of assistance: changing abbreviations into their fully expanded form, or how about expanding Uni course codes so the title of the course is also included?
  • A biggie in this regard is spotting an out-going message that, in a heated (or otherwise incapacitated!) state of mind, you are likely to regret sending later on. Applying the technique of Sentiment Analysis to the text is one way to approach this.
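The first two outgoing rules above are straightforward regular-expression rewrites. A rough Python sketch; both patterns are deliberately crude and would need refinement to avoid false positives (proper nouns, legitimate repeated words, acronyms):

```python
import re

def fix_doubled_words(text):
    """Collapse accidental word doubling, e.g. 'the the' -> 'the'."""
    return re.sub(r"\b(\w+)\s+\1\b", r"\1", text, flags=re.IGNORECASE)

def fix_stray_capitals(text):
    """Fix a held-down shift key, e.g. 'DAvid' -> 'David': two leading
    capitals followed by lowercase letters."""
    return re.sub(
        r"\b([A-Z])([A-Z])([a-z]+)\b",
        lambda m: m.group(1) + m.group(2).lower() + m.group(3),
        text,
    )
```

A real rule engine would let the user enable, disable, and parameterise checks like these per account.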

Useful Links:

  • If-This-Then-That (IFTTT) by way of inspiration
  • The Java Apache Mail Enterprise Server (JAMES). [They have definitely done a better job on the naming front than myPoV-ES!!] Note: this is just one example of an Open Source email server. I've chosen to highlight it as it includes Mailet containers, a mechanism for incorporating "independent, extensible and pluggable email processing agents".
  • Alternatively, take a look at Open Source email clients (just an example listing), as a way of providing a coding base from which you can embed your rule-based interface.
  • Here's another article discussing an alternative to gmail. Note the description of Zimbra which provides both an email server and web-client.
  • There might be some mileage in officially provided extension mechanisms such as Gmail Add-ons (some assessment needed).
  • Alternatively, take a roll-your-own approach and glom on to an existing commercial API, such as GMail, through a JS library such as gmail.js

The World According to Me!

Key Idea: Develop an extension to a web browser that lets you seamlessly edit any web page you are on, storing the changed version locally. When you visit the same page again, your edited version is the one that is displayed.

The project is a revitalisation of a project called Seamless Webpage Editing, or Seaweed for short. With Seaweed installed in your browser (through a GreaseMonkey extension), I could visit any web page and edit it if I wanted to. I'll just repeat that: I could visit any web page and edit it. We added some basic back-end storage capability so edits could be saved and restored when you visited the web page again, but otherwise left the work as a proof-of-concept research project.

Since our work, web technologies have moved on, and there are now more robust ways to provide the core functionality we developed. In particular, CKEditor has an inline editing mode, which delivers the crucial ability to edit a web page without reloading it to activate editing ... where what is meant by "editing" is a much more richly developed capability than Seaweed had.

My vision for this project is to create a web browser environment in which I can perform my regular browser activities. Any time I see content that I want to change, I can act on it, be it to:

  • highlight part of the page, to help draw attention to the part I found useful should I return to the page at a later date;
  • delete sections of the page that are just distracting, extraneous blocks that obscure what I want to see;
  • fix an annoying typo that forever lives on and that the content creator of the page isn't going to change; or
  • add an annotation to a region of the page, so I can store notes that I would like to be related to the page.

With suitable crafting of the software architecture for this project, it should be possible to create 'namespaces' in which all the edits/annotations are stored, and which can be shared with other users. The editing environment would then let the user switch between them. Let's say there is davidb as my personal namespace, but I have also created one called compx203 for Computer Systems. The latter is shared with students in COMPX203, and when those users access the web, they will encounter content with highlighting, notes, and other editing to help bring out its relevance to the course. I would even push the idea further and say that a namespace could specify whether the ability to edit is restricted to a list of users, or open to anyone accessing the namespace.

    A challenging problem for this world of editing is websites with dynamic content that changes regularly: pretty much the home page of every interesting high-volume website out there! Since undertaking the original Seaweed project, I have thought of a technique based on hashing that can be leveraged to make the idea work on these types of site too.

    The basic rough-cut idea runs as follows:

    • There is a relatively simple Javascript program that can be written that recursively traverses the Document Object Model (DOM) that is formed when a web page is loaded into a browser.
    • The first time the World According to Me! (WAM!) encounters a page, it runs this traversal algorithm, and for each node in the DOM it computes a hash value based on that node's innerHTML, which it stores as a data attribute in that node.
    • WAM! then saves this modified page to the backing store infrastructure it provides.
    • The user is then free to edit, update, delete, annotate the page as they see fit. Any edits are saved to the WAM's backing store infrastructure.
    • Later on, if the user visits the (dynamically changing content) website again, it goes through its hash-computing DOM traversal again. For these newly computed hash-values, if they match the WAM! backing-store saved value, then we have established a point of correspondence—even if where it is on the page is no longer the same (e.g., it has moved further down the page). For any of these points of correspondence, if WAM! has an edited version of that hash-value, then it should take its version, and use it to replace that part of the newly loaded page.

    That is not to say that developing such an algorithm will be straightforward. It will be challenging for sure, and there will be edge-cases that the above has not considered; however, it does lay out a basic approach for tracking content that moves position on a page over time.
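To make the hashing step concrete, here is a Python sketch of the recursive traversal over a toy DOM modelled as nested dicts rather than real browser nodes. The actual WAM! code would be Javascript walking the live DOM, and the data-attribute name used here is invented.

```python
import hashlib

def hash_tree(node):
    """Recursively compute a content hash for each node of a toy DOM.

    A node is a dict: {"tag": str, "text": str, "children": [...]}.
    The hash plays the role of the innerHTML-based data attribute in
    the steps above: equal content gives an equal hash, wherever the
    node has moved to on the page.
    """
    child_hashes = [hash_tree(child) for child in node.get("children", [])]
    payload = node["tag"] + node.get("text", "") + "".join(child_hashes)
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    node["data-wam-hash"] = digest  # stored on the node, as in step 2
    return digest
```

Matching a saved hash against a freshly loaded page then establishes the "point of correspondence" the algorithm relies on, independent of where the node now sits in the tree.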

A Fountain of Information

Key Idea: A virtual 3D fountain where the jets of water displayed are a representation of some interesting information that changes over time.

The Data Fountain is a thought-provoking project where a team has built a physical fountain with three jets of water, where each jet is respectively linked—in realtime—to the exchange rates of the US dollar, the Euro and the Yen. The height of a jet is proportional to the currency exchange rate that it represents, with its height updated every 5 seconds.
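The core mapping, from a data reading to a jet height, is a one-line normalisation plus clamping. A sketch, where the expected range and maximum height are whatever calibration you choose for your scene:

```python
def jet_height(value, lo, hi, max_height=10.0):
    """Map a data reading (e.g. an exchange rate) onto a jet height.

    The reading is normalised against an expected [lo, hi] range and
    clamped, so a jet never collapses to nothing or blows past the top
    of the scene. max_height is in whatever units the 3D scene uses.
    """
    if hi <= lo:
        raise ValueError("need lo < hi")
    fraction = (value - lo) / (hi - lo)
    return max_height * min(max(fraction, 0.0), 1.0)
```

Re-evaluating this every few seconds per jet, against a live feed, reproduces the Data Fountain's behaviour; swapping in other feeds is then just a matter of choosing lo and hi sensibly for each source.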

This project is about taking the idea of the Data Fountain and moving it into the digital realm. Without the same physical constraints, there are a lot more possibilities to explore: the number of jets, the type of jets, how the water flows, the colour of the water; a range of different sounds could even be added in.

As to the source of information, this too is highly configurable in a digital version. A trivial extension would be to expand upon the three currencies of the Data Fountain: crypto-currencies would be an obvious choice here, given how they have hit the news; company share prices are another variation in the same vein.

But there are so many other sources of information. Put into the context of a Digital Data Fountain display at a university, what sorts of source data would be interesting to map to the fountain display: the volume and different types of data passing through the university network? A display that captures how many lectures are presently going on, with aspects that indicate the year of study and subject?

Looking to bring in inspiration from further afield, check out the Reddit Data Is Beautiful thread. Thinking about many of these example sources of data brings out the idea that the source for the fountain display doesn't necessarily have to be real-time. A static data-set could equally be used, as long as there is a meaningful progression through the data. For instance, a fountain display based on "Tracked my student loan from beginning to end" could start out looking like a regular fountain, become more distorted as the debt mounts, and then gradually return to its original form as the loan is paid off. An additional data feature to represent is when the student graduated: there could be the start of something new and different in the fountain display that grows as the debt is paid off, presumably through the employment they now have as a result of their degree.

So the sort of capability this software project is looking for is a 3D rendering environment where jets/waterflow can be displayed, with the viewpoint on the fountain changing over time, ideally with sound effects also being played. This rendering capability can be connected to a variety of different data sources, with some configuration set, and then the fountain is "switched on".

There are many 3D graphics environments to choose from. Following my earlier adage, that you should always look to make a new project web-based unless there is a sound technical reason why it cannot be, a good starting point would be to assess the capabilities of WebGL.

And as a final comment on the trajectory this project takes: the visualisation the project develops doesn't even really need to be a fountain! A fountain captures the origins of the project idea, and certainly has pleasing connotations with a physical form that people are inclined to gravitate towards and stand and watch for a while. But maybe you can think of something else that is equally, or more, engaging?

Bingally Bong

Key Idea: Develop an app that bingally bongs a person on their phone with information that might be of use to them, given the location they are in. Could be any sort of information, but note that the project was originally conceived as a bespoke mobile-phone travel app for families with kids travelling abroad.

  • Keywords: Mobile App development; GPS; Social/Crowd-sourcing;

For this project I imagine two distinct phases to the software app developed: Explorer mode and Wisdom of the Crowd. To be honest, when in Explorer mode the app isn't that supportive, but that's OK, as you are the intrepid explorer! What it is doing, though, is running GPS the whole time and paying attention to when you seem to spend a lot of time in one location. There's usually a reason, either good or bad, for such a "hotspot". Maybe you were figuring out how best to get into the city centre from the airport (someone didn't plan ahead, did they Dad?). Or perhaps you stopped at a cafe (was it any good?), or were viewing one of the sights to see.
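The "spending a lot of time in one location" detection can be prototyped very simply: look for runs of consecutive GPS fixes that stay near where the run started. A sketch, using raw lat/lon differences rather than proper great-circle distance, and with thresholds I have made up:

```python
def find_hotspots(trace, radius=0.001, min_dwell=600):
    """Find 'hotspots' in a GPS trace: runs of consecutive fixes that
    stay within `radius` (in raw lat/lon degrees, crude but fine for a
    sketch) of where the run started, lasting at least `min_dwell`
    seconds. Each fix is (timestamp_seconds, lat, lon); returns a list
    of (lat, lon, dwell_seconds).
    """
    hotspots = []
    i = 0
    while i < len(trace):
        t0, lat0, lon0 = trace[i]
        j = i
        while (j + 1 < len(trace)
               and abs(trace[j + 1][1] - lat0) <= radius
               and abs(trace[j + 1][2] - lon0) <= radius):
            j += 1
        if trace[j][0] - t0 >= min_dwell:
            hotspots.append((lat0, lon0, trace[j][0] - t0))
        i = j + 1
    return hotspots
```

A production version would use haversine distance and a proper clustering method, but this captures the end-of-day "show me where I lingered" step.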

Having run your app in Explorer mode, at the end of the day, when you plug your phone into your laptop (say), it shows you these hotspots on a map and asks you to enter some information explaining what was happening, which it stores centrally. The enriched information built up by the explorers feeds the Wisdom of the Crowd side of the app. In this latter mode, when you find yourself at the airport the app vibrates to let you know there is information potentially relevant to where you are that it can show you. In this case, it could inform you of what previous people determined was a good course of action for getting into the city centre. This could even factor in the time of day that they did so.

An added twist to the app in Explorer mode is that it lets you take photos, and/or is integrated with the GPS locations of photos you have been taking during the day. These might be useful to show someone using the Wisdom of the Crowd side of the app to help that user orientate themselves.

Unfortunately we don't have the budget to send you to any exotic locations to trial the software you develop; however, the ideas expressed in this project work equally well when applied to someone new to our university's campus.

The above-stated mantra about always developing as a web app (unless there is a technical impediment that prevents you) can be applied here. Some sort of back-end store will be needed for the explorer-generated content. As food for thought, take a look around the Paradise Gardens showcase, which illustrates a technique for spatial/proximity searching and is built using our very own Open Source Greenstone Digital Library software.

For Bingally Bong (BB) to keep working as the content entered by Explorers grows, measures need to be in place so those who follow are not continually spammed with information. There therefore needs to be a way to align the interests of those who follow with the body of information stored in the system. An interesting angle to take, then, could be looking at whether the app can be hooked in with an existing social media platform, such as Facebook. The idea here would be that BB could apply a Topic Modelling algorithm across the content that an individual has posted to Facebook, and from that establish a focused set of keywords/topics with which to filter the text content that BB has.
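A full Topic Modelling pipeline is a project in itself, but the filtering step it feeds can be sketched with plain keyword overlap as a stand-in. The data shapes here (a "keywords" field per notice) are my own invention:

```python
def relevant_notices(notices, user_topics, min_overlap=1):
    """Filter stored notices down to those sharing at least
    `min_overlap` keywords with the user's inferred topics: a crude
    stand-in for matching against topic-model output.

    Each notice is a dict with a "keywords" list; user_topics is the
    keyword set derived (eventually) from the user's social media posts.
    """
    user = {t.lower() for t in user_topics}
    return [
        n for n in notices
        if len(user & {k.lower() for k in n["keywords"]}) >= min_overlap
    ]
```

Swapping the keyword sets for topic distributions (and the set intersection for a similarity score) upgrades this to the topic-model version without changing the surrounding plumbing.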

Space and Time (All I Ever Wanted)

  • Key Idea: Develop an application to learn about historic events and people through a visual interface that seamlessly combines maps (space) and dates (time).

For this project, I imagine an interactive visualisation interface based around a map view combined with a date/timeline. The map shows items and/or regions of interest, and if I adjust the date/timeline then more or fewer things are shown. I'd start by typing some key terms into a search box to get things going, giving the app some idea of what I'm currently interested in. I can then start zooming in and out and panning the map view, and (just to repeat the point) adjusting the start and end points of the timeline I'm interested in. When I click on an item of interest on the map, the view centres on that item, and a popup window of some form lets me read more detailed information about it. I can then also indicate that the keyword associated with this item is my new focus (the equivalent of saying: this is now my search term).

The app should also let you gracefully move from a map based view to a time-based view, and vice versa. In the time-based view the same information is displayed, only this time chronologically ordered. I can then interact with this view, perhaps expanding a family tree item, and making one of that person's descendants my latest focus of interest. Having done that, I then ask it to go back to the map view.
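The combined space-and-time filter that drives both views can be sketched as a single predicate over the item set; the field names here are my own invention:

```python
def visible_items(items, t0, t1, bbox):
    """Filter historical items to those inside the current view.

    Each item is a dict with 'lat', 'lon', 'start' and 'end' years
    (a point event has start == end); bbox is (min_lat, min_lon,
    max_lat, max_lon) for the map view, and [t0, t1] is the timeline
    window. An item shows if its location is inside the box and its
    date range intersects the window.
    """
    min_lat, min_lon, max_lat, max_lon = bbox
    return [
        item for item in items
        if min_lat <= item["lat"] <= max_lat
        and min_lon <= item["lon"] <= max_lon
        and item["start"] <= t1 and item["end"] >= t0
    ]
```

Zooming the map changes bbox, dragging the timeline changes t0 and t1, and the same filtered list feeds both the map view and the chronologically ordered time view.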

Taking the idea of learning about historical events, a potential source of information could be to target particular areas of Wikipedia. To take this one step further, work with a Linked Data source such as DBpedia or WikiData.

Smoke and Mirror Projects: From the Vaults

The Smoke and Mirrors brand has been a signature component of the Software Engineering programme at the University of Waikato from its inception. First run in 2003, it started life as a group-based project that 1st Year SE students, who had been learning to program in the C++ language, undertook. In 2010 it moved to the 2nd year level, with Java being the programming language taught, where it has remained since.

It is one of the great pleasures of my job at Waikato to be involved in providing the Smoke and Mirrors experience for our SE students, and for so many years (all of the years the projects have run, in fact!). There even came a point where I was due to be on sabbatical in the semester Smoke and Mirrors was scheduled to run; a year in advance, however, the department changed the semester it ran in, so I could continue running the projects.

I haven't been able to locate any written record of the projects run in that first year, sadly. One from that year that does come to mind, however, was a team developing a software-based arbitrary-precision calculator. As part of their presentation to the department at the end of the semester, they demonstrated their GUI-based calculator calculating π to ... well ... a very high precision! For the years 2004–2009 I have been able to track down the titles of the projects that ran, which at least hints at the variety of projects undertaken. For more recent years, I still have the project briefs that the students start with, when the projects are launched.

With a nod to posterity, here are the projects by year, working back from newest to oldest.