I may have great music, but I must have great sound.
Great music will require hiring a composer and/or licensing pre-made music. That may come to fruition; we’ll see. I’m a musician, but I don’t have the time or dedication to spend composing and recording a soundtrack. Someone else could do much better.
If I stumble across something I like, I will consider it. But I am not opposed to going without music. Limbo did it, and did it well.
Ambient sounds, on the other hand, I think are essential. And I’m fairly confident that with access to a diverse and affordable sound library (such as soundsnap), I can make this a reality. This is another system that, with any luck, could come together with emergent beauty.
Rain and wind are important things to get right. Of course, footsteps and weapon swings. And perhaps also spooky drones, or mysterious whispers in the night. Mountain streams rushing down the valley. Ocean noise, or lake waves gently lapping on the shore.
Currently I’m still torn as to whether to use XACT, or roll my own sound engine. XACT is complex but powerful. It’s really intended for sound designers, as it separates sound from code. The sound designer exposes a set of tweaks and variables to the programmer. All the programmer has to do is, for example, set the “EngineSpeed” variable on a sound. XACT lets you define how various “base sounds” cross-fade between each other, or vary in pitch or speed based on these variables. It can transparently play a random selection from a set of sounds.
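To make the variable-driven cross-fade idea concrete, here is a minimal sketch in Python. It is not XNA code and not how XACT is implemented internally; the function name `crossfade_gains`, the anchor values, and the choice of an equal-power curve are all assumptions made for illustration. Given the current value of a variable such as “EngineSpeed”, it returns one volume gain per base sound.

```python
import math

def crossfade_gains(value, anchors):
    """Return one gain per base sound for the given variable value.

    anchors: ascending variable values at which each base sound plays
    at full volume (hypothetical tuning data a sound designer would set).
    Between two neighbouring anchors, the two sounds are blended with an
    equal-power curve so the perceived loudness stays roughly constant.
    """
    gains = [0.0] * len(anchors)
    # Clamp below the first anchor and above the last one.
    if value <= anchors[0]:
        gains[0] = 1.0
        return gains
    if value >= anchors[-1]:
        gains[-1] = 1.0
        return gains
    # Find the pair of anchors that brackets the value and blend them.
    for i in range(len(anchors) - 1):
        lo, hi = anchors[i], anchors[i + 1]
        if lo <= value <= hi:
            t = (value - lo) / (hi - lo)
            gains[i] = math.cos(t * math.pi / 2)
            gains[i + 1] = math.sin(t * math.pi / 2)
            break
    return gains

# Example: three engine loops (idle, mid, high) anchored at 0, 50 and 100.
idle, mid, high = crossfade_gains(25, [0, 50, 100])
```

With the variable halfway between the first two anchors, the idle and mid loops play at equal gain and the high loop stays silent. The game code would only ever set the variable; the mapping to volumes lives in data.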
Unfortunately XACT generates garbage in XNA’s managed code environment, and that can have a significant performance impact on runtimes without generational garbage collection (the Xbox 360).
XNA provides a very basic alternative sound API that avoids these issues. So my current preference is to build a simple sound engine on top of that API, one that offers some of the basic XACT functionality. Going this route might also make it easier to port to other frameworks in the future.
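A rough sketch of what such a thin engine could look like, written in Python purely to show the shape of the idea. Every name here (`SoundEngine`, `RecordingBackend`, `define_cue`, the sample names) is hypothetical, and the backend is a stand-in for whatever low-level playback call the framework provides. Keeping that call behind a small interface is what would make a later port cheaper.

```python
import random

class RecordingBackend:
    """Stand-in for the framework's low-level playback call (hypothetical).

    A real backend would hand the sample to the platform audio API;
    this one only records what it was asked to play.
    """
    def __init__(self):
        self.played = []

    def play(self, sample_name, volume, pitch):
        self.played.append((sample_name, volume, pitch))

class SoundEngine:
    """Maps named cues to sets of samples, in the spirit of XACT cues."""
    def __init__(self, backend, rng=None):
        self.backend = backend
        self.rng = rng or random.Random()
        self.cues = {}

    def define_cue(self, name, samples, pitch_jitter=0.0):
        # pitch_jitter is the maximum random pitch offset in either direction.
        self.cues[name] = (list(samples), pitch_jitter)

    def play(self, name, volume=1.0):
        samples, jitter = self.cues[name]
        # Pick a random variation so repeated sounds feel less mechanical.
        sample = self.rng.choice(samples)
        pitch = self.rng.uniform(-jitter, jitter)
        self.backend.play(sample, volume, pitch)
        return sample

# Example: game code asks for "footstep" and never names a specific file.
backend = RecordingBackend()
engine = SoundEngine(backend)
engine.define_cue("footstep", ["step_a", "step_b", "step_c"], pitch_jitter=0.1)
engine.play("footstep")
```

The game code only ever refers to cue names, so swapping the backend for another framework's playback call would not touch it.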
Voice work. I don’t plan to have any spoken word in my game. It would be a ton of work and expensive, and I think it can easily come off as cheesy unless it is used consistently throughout the game and is of high quality. If I can’t do it well or efficiently, I won’t do it.