Functional Prototype 1

We built a prototype of a gesture-based interactive application for 3D world exploration. Our goal was to create an experience in which users can easily and intuitively explore a virtual world augmented by digital representations of real-world events. In our implementation, we used Microsoft WPF and the official Kinect SDK for gesture recognition, Google Earth for spatial navigation, and Twitter to represent real-world events. In addition, our application was designed to be displayed on a large screen or projector, since screen size has a direct effect on how immersive the experience is and how natural the gestures feel.

Features & Rationale

Microsoft WPF + Kinect SDK: We chose this technology for gesture programming because 1) we had previous experience with the development environment/framework, and 2) it can be integrated with Google Earth.

Google Earth: Based on feedback from our mentor, Christine, the maturity of its API, and its support for both native and web versions, we decided to use Google Earth for spatial navigation.
Street View: While we considered having both map and street views, we chose to focus only on Street View navigation in order to fully develop an exciting, rich experience.

Screen Size: As we tested our Wizard-of-Oz prototypes on various screen sizes, we realized that a big screen created a much more immersive experience. The large scale mirrors the real world more accurately, allowing our gestures to feel natural; turning with the shoulders doesn’t map as intuitively onto a small screen interface. Overall, a large screen lends itself to our goal of recreating a “life-size experience”.

Gestures: Based on last week’s Wizard-of-Oz testing, we realized that having a lot of gestures confused users because there was too much to remember. We therefore hoped to alleviate the mental workload and decrease the chance of false positives by keeping our gesture set minimal in quantity but maximal in quality (covering the basic functions while feeling natural).

“It’s not fatiguing… it’s really easy to do… I could do this all day and not get tired.” – Brad

  • Forward/Backward gesture: feels natural, is not tiring, and can be used in conjunction with the shoulder-turning gesture (to move diagonally)
  • Shoulder Turning: leverages familiarity with turning in the real world, making it easy to understand and remember, and therefore more natural
  • Birdwatcher: originally designed to bring up a menu; we changed its function to enable looking at or inspecting something more closely

Real World Events: While we believed this rich and interactive way to virtually visit and explore locations was valuable, we were not sure the user experience would be compelling enough for users to choose the application over more traditional ways of exploring. So we thought: what if, as you visit these “virtual” places, you could see what real people are experiencing in those locations? How might that change the user experience?

  • Twitter: We chose Twitter because it has a lot of activity and provides free access to its data, some of which is already geo-tagged, making it easier for us to plot tweets in our virtual world.

Visual Feedback: After initial testing, we realized that it was really important, from a user experience perspective, for users to know when the system detected a gesture and which action was triggered as a consequence. It was also very important to know who actually “held control” of the Kinect. We therefore decided to add a small window showing the Kinect’s input to provide simple visual feedback.

Current Implementation Progress

Gestures: We wrote a new custom controller based on the provided Skeleton Controller to recognize three gestures (walking forward/backward, turning left/right, birdwatching). The smaller window with the video stream and targets was kept for development and debugging purposes, as well as to provide visual feedback.
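
For illustration, here is a minimal sketch of the kind of joint-position thresholding such a controller performs, using the Kinect SDK 1.x skeleton API. The threshold values, and the binoculars-style pose assumed for the birdwatcher check, are illustrative assumptions rather than our controller’s exact logic:

```csharp
// Minimal sketch of joint-position thresholding (Kinect SDK 1.x skeleton API).
// Threshold values and the birdwatcher pose are illustrative assumptions.
using System;
using Microsoft.Kinect;

public class GestureSketch
{
    private const float StepThreshold = 0.15f; // meters a foot must lead the hips
    private const float TurnThreshold = 0.12f; // shoulder depth difference (meters)
    private const float HandToHead = 0.25f;    // hand-to-head distance (meters)

    public bool IsWalkingForward(Skeleton s)
    {
        // Stepping forward puts one foot noticeably in front of the hips
        // (smaller Z means closer to the sensor).
        float hips = s.Joints[JointType.HipCenter].Position.Z;
        float front = Math.Min(s.Joints[JointType.FootLeft].Position.Z,
                               s.Joints[JointType.FootRight].Position.Z);
        return hips - front > StepThreshold;
    }

    public bool IsTurningLeft(Skeleton s)
    {
        // Turning the shoulders left pulls the left shoulder away from the
        // sensor and brings the right shoulder closer.
        return s.Joints[JointType.ShoulderLeft].Position.Z
             - s.Joints[JointType.ShoulderRight].Position.Z > TurnThreshold;
    }

    public bool IsBirdwatching(Skeleton s)
    {
        // Assumed pose: both hands held up near the head, as if using binoculars.
        return Distance(s.Joints[JointType.HandLeft].Position,
                        s.Joints[JointType.Head].Position) < HandToHead
            && Distance(s.Joints[JointType.HandRight].Position,
                        s.Joints[JointType.Head].Position) < HandToHead;
    }

    private static float Distance(SkeletonPoint a, SkeletonPoint b)
    {
        float dx = a.X - b.X, dy = a.Y - b.Y, dz = a.Z - b.Z;
        return (float)Math.Sqrt(dx * dx + dy * dy + dz * dz);
    }
}
```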

Google Earth: We integrated Google Earth by embedding a web browser control in our WPF application and driving navigation with simulated keyboard strokes.
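
A minimal sketch of this embedding, assuming a hypothetical earth.html page that hosts the Google Earth plugin and using WinForms’ SendKeys as one possible key-injection mechanism (the file path and key bindings are illustrative):

```csharp
// Sketch: embed Google Earth (via its browser plugin) in a WPF window and
// drive navigation with simulated arrow-key presses.
using System;
using System.Windows;
using System.Windows.Controls;
using SendKeys = System.Windows.Forms.SendKeys; // reference System.Windows.Forms

public class EarthWindow : Window
{
    private readonly WebBrowser browser = new WebBrowser();

    public EarthWindow()
    {
        Content = browser;
        // earth.html hosts the Google Earth plugin (placeholder path).
        browser.Navigate(new Uri("file:///C:/app/earth.html"));
    }

    // Called by the gesture controller when a "walk forward" gesture fires.
    public void MoveForward()
    {
        browser.Focus();
        SendKeys.SendWait("{UP}");   // Google Earth moves/pans on arrow keys
    }

    public void TurnLeft()
    {
        browser.Focus();
        SendKeys.SendWait("{LEFT}");
    }
}
```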

Twitter: We aim to integrate content from Twitter into Google Earth for social interaction between users. This integration requires two files: a KML file and an HTML file. The HTML file loads the KML file and adds its contents to the Google Earth instance, as sketched below.
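
Here is a sketch of the HTML side, assuming the Google Earth browser plugin’s JavaScript API (google.earth); the KML URL and the map3d div id are placeholders:

```html
<!-- Sketch of the hosting page: create a Google Earth plugin instance and
     load the tweet KML into it. The KML URL is a placeholder. -->
<script src="https://www.google.com/jsapi"></script>
<script>
  google.load("earth", "1");
  var ge;

  function init() {
    google.earth.createInstance("map3d", initCallback, failureCallback);
  }

  function initCallback(instance) {
    ge = instance;
    ge.getWindow().setVisibility(true);
    // Fetch the KML file and append its contents to the Earth instance.
    google.earth.fetchKml(ge, "http://example.com/tweets.kml", function (kmlObject) {
      if (kmlObject) {
        ge.getFeatures().appendChild(kmlObject);
      }
    });
  }

  function failureCallback(errorCode) {}

  google.setOnLoadCallback(init);
</script>
<div id="map3d" style="width: 100%; height: 100%;"></div>
```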

KML is a file format used to display geographic data in an Earth browser such as Google Earth, Google Maps, and Google Maps for mobile. KML uses a tag-based structure with nested elements and attributes and is based on the XML standard.

To mark positions on the Earth’s surface, we placed Placemarks on the map, using a yellow pushpin as the icon. Each Placemark includes only a <Point> element, which specifies the location of the Placemark, and a <description> tag containing what the tweet says.
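
A minimal example of what such a Placemark can look like (the coordinates and text are placeholders; the yellow pushpin is Google Earth’s default placemark icon):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark>
      <!-- The tweet's text goes in the description; the yellow pushpin is
           shown because it is Google Earth's default placemark icon. -->
      <description>Text of the tweet goes here.</description>
      <Point>
        <!-- longitude,latitude[,altitude] -->
        <coordinates>-122.0822,37.4222,0</coordinates>
      </Point>
    </Placemark>
  </Document>
</kml>
```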

To Do List

Here is a list of action items (ordered by priority) inspired by Wednesday’s in-class demo and testing:

1) Add gesture to control speed (multiple options)

  • Use the length of the step to determine the speed
  • Hand-based gesture to turn speed up/down (analogous to volume control)

2) Add “rescue” and control gestures

  • Meant to help the user reset the location and start/stop the interaction

3) Improve the way we provide visual feedback (multiple options)

  • Overlay partially transparent skeleton data over the street view
  • Add additional widget elements to let the user know which gesture is being recognized and therefore which action is taking place (e.g., arrows on the edges of the screen that light up as the user successfully performs a gesture)
  • Provide a visual cue when the system has successfully locked into a user (analogous to an on/off switch)
  • “Lock” the Kinect to a specific user (to avoid other people taking control of it)

4) Plot a “significant” number of tweets, automating the process as much as possible

  • Create custom icon for tweets
  • Create custom view to display their contents

5) Provide context information

  • Label describing where you currently are (City name, for example)
  • Perhaps add a contextual top-down map widget
  • Compass

6) Design a proper interface/layout (aka “prettify” the application)

  • Still aiming to have the 3D view take up most of the screen, for the sake of immersion

7) Create gesture to “look up/down” (tilting the camera angle with your neck/head)

8) Create gesture to adjust altitude (aka FLYING!)

9) Allow exploration of “cool” places (e.g., Mars, the Moon)