Jim Sager's blog about Artificial Intelligence


11/24/2013: Blog Entry 1 after 11 years off.



Hello. I was posting on Slashdot about how obvious the right way to do Artificial Intelligence seems to me, and then I realized not everyone sees AI the way I see it. I'm opening this blog to talk through AI. I'll do my best to make complicated matters basic. A basic knowledge of computer programming will help in reading this blog, but maybe you can get something out of it without any.

I think I need to name this AI to start. It is based on an imagination space, so I'll call it Imagination Based Artificial Intelligence: IBAI (pronounced eye-bye).

Let me explain this now:

Top Level: Think of any 3d video game. Objects interact with each other, and the computer can look at the state of the objects in the 3d world to determine its next move against the player.

Broken down more specifically:

1) Cameras, GPS and laser range finders build a picture for the robot of its surroundings.
2) Complex software inside the robot must then try to understand what it is looking at and build a 3d world in its head.
3) Knowing the world it is in, the goals its owner gave it, and the actions it can take on the environment, the robot will try to accomplish those goals.
Broken down even more:

1) Cameras, GPS and laser range finders build a picture for the robot of its surroundings.

1) The hardware is obvious; we have it now. No need to explain camera tech other than that higher quality is better. There are self-driving cars with the hardware rigs on them now.

2) Complex software inside the robot must then try to understand what it is looking at and build a 3d world in its head.

2) The complex software that determines what the robot is looking at can be broken down as follows, and it's hard to explain clearly because it is all interrelated: a) A memory space, not unlike what 3d GPUs have, to store data about what is seen.
-The key is that to represent what is seen, the objects need to be identified and recreated as 3d meshes.

--3d mesh objects could be stored in a huge database of every object the team can think of entering from the world. An object might be a stapler, a pen, a computer, a car, a building, and so on. One way of databasing objects is to build them by hand in a 3d modeler.

---3d mesh objects can't just be what you find in a video game, though. They need more data. You need to know how hard each part of an object is, for example, to know how it will react to collisions. So each object should be reduced to parts that assemble into a larger object. A pencil might have a lead part, a wooden part, a metal part to hold the eraser, and the eraser itself. Already you should be thinking to yourself that this can get complicated, and you're right: it is a real-time physics simulator that can take a 3d scene and guess what might happen when things collide. Basically you're looking at a collision detection and analysis physics engine. You'd take what is already known about collision detection and try to model everything from the ground up. If you're a physics person, you know things change because forces act on them, so a lot of what we know comes down to what collides with what and what result we expect. What we need is a lightweight physics engine that doesn't calculate a lot, so it can run in real time.
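To make the parts idea concrete, here is a minimal sketch in Python. The class names, fields, and the pencil numbers are all placeholder assumptions of mine, not a finished data format:

from dataclasses import dataclass

@dataclass
class Part:
    name: str        # e.g. "wooden shaft"
    hardness: float  # made-up scale: 0.0 (soft) to 10.0 (very hard)
    mesh_file: str   # the 3d mesh for just this part

@dataclass
class WorldObject:
    name: str
    parts: list      # Part instances that assemble into the larger object

# A pencil reduced to parts, with invented hardness values
pencil = WorldObject("pencil", [
    Part("graphite core", 1.5, "pencil_core.obj"),
    Part("wooden shaft", 3.0, "pencil_wood.obj"),
    Part("metal ferrule", 6.0, "pencil_ferrule.obj"),
    Part("eraser", 0.5, "pencil_eraser.obj"),
])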

-----An example of something successful from about 40 years ago is SHRDLU. SHRDLU's success is that it guessed what would happen based on basic objects interacting. I think someone could make a physics engine with basic features for forming simple items into complex items if it were open source. Down the line, after my ideas have congealed here in blog/website form, I'll start an open source physics engine. My idea is to give every material a hardness. When hard things collide with soft things, they can pierce, slice, or dent them to a degree. I think this would be hugely complex, and it would need to be checked against real-life testing to make sure the results are close.
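Here is a toy sketch of that hardness rule. The thresholds, and the idea of scoring a collision by the hardness gap times the impact force, are purely my own placeholders to show the shape of the idea:

def collision_outcome(hardness_a, hardness_b, impact_force):
    """Guess what the harder part does to the softer part in a collision."""
    soft, hard = sorted([hardness_a, hardness_b])
    effect = (hard - soft) * impact_force  # crude severity score

    # Made-up thresholds; real ones would come from real-life testing
    if effect > 50.0:
        return "pierce"
    elif effect > 20.0:
        return "slice"
    elif effect > 5.0:
        return "dent"
    return "bounce"

# Metal ferrule (6.0) hitting the eraser (0.5) with moderate force
print(collision_outcome(6.0, 0.5, 5.0))  # -> "slice" with these numbers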

------So to back things up some: in order to even database objects into the system, you need to know how to represent an object with data, or it's garbage in, garbage out. By reducing your models to simple shapes that carry hardness and other data, you can build more complex models out of them. Instead of starting with raw numbers, the best thing to do is probably to give each object a material type such as wood, metal, plastic, etc. You could add more descriptors for finer detail, because after all there's some difference between balsa wood and oak. But by describing objects generally as known materials which are linked to numbers, you can change the numbers without re-databasing all your objects, which would just be bad programming anyway.
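A minimal sketch of that indirection: the parts from the earlier sketch could store a material name, and the numbers live in one table you can tune later without touching any object. The material names and numbers here are assumptions for illustration:

# One table of material properties; tweak these numbers without
# re-databasing any of the objects that reference them.
MATERIALS = {
    "balsa wood": {"hardness": 1.0},
    "oak":        {"hardness": 3.5},
    "plastic":    {"hardness": 2.5},
    "steel":      {"hardness": 8.0},
}

def hardness_of(material_name):
    # Parts carry only the material name; the numbers are looked up here.
    return MATERIALS[material_name]["hardness"]

print(hardness_of("oak"))  # -> 3.5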

-------Again, we must first know the data format we're using before we database in a bunch of things. One of the steps is to database in a bunch of objects so we can simulate more complex scenes in the physics engine, which is our imagination space. Having databased objects will also help us identify which objects the camera is looking at when building a scene. If it checks a scene for objects and matches a pencil image against a pencil in the 3d database, it then knows more about what it is looking at. That is one way to do it. I'm not really sure what the best way to do image recognition and 3d scene creation is; if I could do that easily, I'd just code it up myself. I just know it is something that can be done, even if the computer is checking the scene against things in its 3d model database. This component isn't easy. I've seen some little stabs at it. Look at Kinect: it can detect where your body is because it is built to detect a human body. Maybe if they expand Kinect-style technology they can detect more things.
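I don't know the right recognition algorithm, so treat this as nothing more than a placeholder for the idea of checking what the camera sees against the 3d model database. The size descriptor and the nearest-match rule are assumptions just for illustration:

import math

# Pretend each database entry is summarized by a rough bounding-box size
# (width, height, depth in meters). Real recognition needs far more than this.
OBJECT_DB = {
    "pencil":  (0.007, 0.007, 0.19),
    "stapler": (0.05, 0.07, 0.15),
    "mug":     (0.08, 0.08, 0.10),
}

def closest_object(detected_size):
    """Return the database object whose size is nearest to what was seen."""
    return min(OBJECT_DB, key=lambda name: math.dist(detected_size, OBJECT_DB[name]))

print(closest_object((0.08, 0.09, 0.11)))  # -> "mug"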

-------- So we have two major technologies required so far: 1) a physics engine that can run decently in real time on the top-of-the-line computers you might put into robots, and 2) scene recognition, so the cameras and laser range finders can let the robot imagine the space it is currently in. The scene recognition software takes sensor data and builds a representation of the world in the physics engine. If you think about it, you could make some kickin' video games if you had this technology too. Just take the vision/rangefinding equipment around some buildings/forests/roads and database everything it sees; then you'd have levels to play video games on :) For example, if Google's cars had this tech on their Street View rigs, you could play Need For Speed: Cannonball Run on your computer. The whole mapping of the USA would be there, so in theory you could drive from New York to Los Angeles.

--------- Again, we need two technologies: 1) a physics engine like an advanced SHRDLU, and 2) a scene recognition application which takes the sensor input and puts data into the physics engine. This is the bare minimum you need to start with. As you get advanced, you could allow the IBAI to start databasing its own 3d models. Also, not all trees look the same, so you'd need code to deal with vegetation and other things out of the ordinary, but that could be its own algorithm extended on top of the physics engine.

3) Knowing the world it is in, the goals its owner gave it, and the actions it can take on the environment, the robot will try to accomplish those goals.

To be honest, #2 is where most of the AI is. #2 gives us a data representation of the world. From there the AI is nothing more complex than what you'd find in a video game... to start. Everyone talks about machine learning, but most of the AI coded at first will probably be done by hand. If you want the robot to move from point A to point B, you have the representation in its head and you use one of the commonly known algorithms. Machine learning would most likely be used to calibrate analog outputs, such as aiming if it were a baseball-pitching robot, for example.
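By "commonly known algorithms" I mean things like grid pathfinding. A minimal sketch, assuming the imagination space has already been flattened into walkable/blocked cells; breadth-first search is just the simplest stand-in here, not the one true method:

from collections import deque

def find_path(grid, start, goal):
    """Breadth-first search over a grid of 0 (walkable) / 1 (blocked) cells."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        x, y = path[-1]
        if (x, y) == goal:
            return path
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if (0 <= nx < len(grid) and 0 <= ny < len(grid[0])
                    and grid[nx][ny] == 0 and (nx, ny) not in seen):
                seen.add((nx, ny))
                queue.append(path + [(nx, ny)])
    return None  # no route from A to B

grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
print(find_path(grid, (0, 0), (2, 0)))  # goes around the blocked middle row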

I hate to disappoint people, but machine learning doesn't need to be a part of AI. Maybe someday down the road it can be built in, but it isn't needed to make fully functional robots which act like humans but are driven by code. Yes, they'd obey us and wouldn't have their own desires. It'd be reckless to make a robot like Bender, with a set of subgoals that includes drinking, carousing, and telling you to bite him. Everything is goal driven with subgoals. Since an IBAI robot would have knowledge of the world in its memory, maybe even networked to a larger memory with more data on the world, it would have somewhat imperfect knowledge to act on. The main goal could be a bunch of coefficients over different activities, or opportunity-based actions, but all of it is originally told to the robot in natural language (AKA English). Even if you wanted to make Bender, which would be trivial with IBAI, you'd first need to tell the robot how to act like that. So everything is initially instruction-based and goal driven, even if your robot's goal is to lie on the couch, waste space, and be your conversation buddy.
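Here is a tiny sketch of what I mean by a main goal being a bunch of coefficients over subgoals. The goal names, weights, and the pick-the-heaviest rule are invented for the example; in practice the owner would state them in plain English and they would get translated to numbers:

# Weights the owner gave the robot, translated from plain English to numbers.
SUBGOAL_WEIGHTS = {
    "tidy the room":   0.7,
    "stay charged":    0.9,
    "chat with owner": 0.4,
}

def pick_next_goal(opportunities):
    """Pick the currently available subgoal with the highest weight."""
    available = {g: w for g, w in SUBGOAL_WEIGHTS.items() if g in opportunities}
    return max(available, key=available.get) if available else None

# The robot sees it could tidy up or chat; the charging station is occupied.
print(pick_next_goal({"tidy the room", "chat with owner"}))  # -> "tidy the room"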

Now, it could be a good conversation buddy, because through the physics-engine imagination space it has, you could ask it, "Can you turn the book The Hobbit into a movie for me?" It would read the natural language input (in any language once it is mature), then create a scene in its imagination space that it could display on a video screen. You could give it details to fill in the scene, "Make Bilbo Baggins have a hat on his head with a feather this time," and it would change it up for you. In fact this is how the robot is interacted with: natural language (AKA English).

Someone might go, "Hmmm, if a robot can read a book and understand it, maybe it could learn." And yes, down the road a while, maybe it could, if you knew what you were doing. But to start, you don't need to make the IBAI too complicated. That is a feature that can be added later. They say KISS, but nothing in AI is simple. Even explaining it here takes some work, and I'm not even writing the hard code.

Again, the whole thing from #2 knows the environment, so just like a 3d video game, it can act on the environment based on its capabilities. Note that IBAI is independent of the robot's chassis and outputs. A robot could be a cuddly pet like a Furby, or it could be a mechanized terror, or both, as anyone who has seen a Furby with its skin off knows. The critical parts are the physics simulator and the scene recognition software, which make up the core of the IBAI.

For the AI to achieve things inside that physics simulator, it could either read the raw data and make static algorithmic movements based on where everything is at one point in time, or it could get complicated and guess what each object in the scene will do. For AI to be functional, it doesn't need to be complex, but I believe solving problems for their own specific robots will be a field programmers get into when adopting IBAI. The key is that problem solving for new outputs won't be that hard, and a lot of prepackaged algorithms could help. To top it all off, anyone could program the robot, because it isn't done in BASIC/C/C++/Java or anything like that, but rather in natural language (English). You simply give your AI a laundry list of how to do things and it does them, with some code for its outputs.

Well that's enough for tonight. I just wanted to get some of my basic ideas out.

Check this out to learn more about where I was 11 years ago. Not much has changed in the basics of describing AI.

12/10/2013: Blog Entry 2 after 2 weeks; for some blogs this might as well be 11 years :P Not for this one, though. I'm only updating when I feel like adding ideas.



So I was thinking: outside of a physics engine and scene recognition, what could be done as incremental advances? One thing is simple scene recognition. Basically you'd have a laser that scans in every direction and gets ranges. By plotting the ranges, it could determine the basic 3d geometry around itself. If your algorithm moved around based simply on not going any place light can't go, you might be able to navigate roads, for example. Other places you might be able to navigate are people's lawns and golf courses. My guess is that self-driving cars use this technology like pros. I know there are laser range finders on them, so they should definitely be able to see whether vehicles are occupying the lanes around them. I wrote a general forum post on how self-driving cars will change logistics for grocery stores and retail.
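A rough sketch of "not going any place light can't go": mark every cell a laser beam passed through as open and the cell where the beam stopped as blocked, then only drive through open cells. The 2d simplification, grid size, and cell size are assumptions for the example:

import math

GRID = 50   # 50 x 50 cells centered on the robot
CELL = 0.2  # each cell is 0.2 m on a side
free = [[False] * GRID for _ in range(GRID)]
blocked = [[False] * GRID for _ in range(GRID)]

def mark_beam(angle, distance):
    """Walk along one laser beam from the robot, freeing cells up to the hit point."""
    steps = int(distance / CELL)
    for i in range(steps + 1):
        x = math.cos(angle) * i * CELL
        y = math.sin(angle) * i * CELL
        cx = int(x // CELL) + GRID // 2
        cy = int(y // CELL) + GRID // 2
        if 0 <= cx < GRID and 0 <= cy < GRID:
            if i == steps:
                blocked[cx][cy] = True  # the beam stopped here: obstacle
            else:
                free[cx][cy] = True     # light passed through: safe to drive

# One sweep of readings (made-up ranges) fills in the map around the robot.
for degrees, rng in [(0, 3.4), (10, 3.1), (20, 5.0)]:
    mark_beam(math.radians(degrees), rng)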

So laser range detection might be one thing to explore to get a 3d map into a computer's brain. Once you have a 3d map, you can do basic things like navigate. With a camera added, you could even do object recognition for a limited subset of objects. So you can navigate and look around for certain key items. A robot that can fetch certain objects and bring them to a location isn't too fancy, but it is a possible minor step toward AI. After all, IBAI is simply a system that knows the world around it, how it can interact to change that world, and the goals it is attempting to achieve. Advanced IBAI might know more of the world and have more data on the objects it is looking at, but simple range finding and object recognition could be an incremental way to build it.

Let's look at how you could build up to advanced IBAI with laser range finding plus limited object detection. First you build the laser range finding to get a rough estimate of the 3d world you're in. You have no data on the qualities of the objects around you yet; for now they're just obstacles. Knowing where the obstacles around you are might let you navigate some environments, so that is pretty big right there. Next you need general object recognition software. You'd have a 3d model of the object you're looking for, and a video camera. If the video camera detects the object via its colors and matches it, you'd know where the object is in 3d space, in addition to all the walls the laser range finder found.

At this point you could either build a robot body able to interact with the object (so far you're probably just holding the range finder and camera with no robot chassis), or you start databasing in more and more objects. The more objects you add, the more information the IBAI gets about the scene. This is actually much more reasonable than making a physics engine to start. You could make a good physics engine down the road, but to start, laser range finding plus object recognition is all you need. Then it incrementally gets better as you add more and more objects.

Okay, here's one: what if you could add objects to the database without a human spending time in a 3d modeler? Maybe have the robot be able to "pick up" and examine an object. Before making any sort of body, simply put the object in a box, either suspended by a wire or lifted up on a base. Then have the lasers capture the object's 3d shape while a camera "paints" the object with its colors. Yes, that would be a quick way of databasing objects and would alleviate a great number of headaches.

To get it down to basics, the laser can only detect a point. The range finder knows where it is and its 3d aim, so it knows where the point is based on the distance it measures. One point doesn't make a mesh, though. It would need to acquire many, many points from different angles, and then the software would take all those points, play connect the dots, and make a mesh out of them :) So yes, advanced IBAI is daunting in terms of programming tasks, but a laser range finding device that scans an object is something a small team could do with a little effort.
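The point math itself is simple. Here is a sketch assuming the aim is given as a horizontal angle (yaw) and a vertical angle (pitch); the angle convention is my own assumption:

import math

def range_to_point(origin, yaw, pitch, distance):
    """Convert one laser reading into an (x, y, z) point in the world."""
    ox, oy, oz = origin
    x = ox + math.cos(pitch) * math.cos(yaw) * distance
    y = oy + math.cos(pitch) * math.sin(yaw) * distance
    z = oz + math.sin(pitch) * distance
    return (x, y, z)

# A sweep of readings becomes a point cloud to play connect-the-dots with.
cloud = [range_to_point((0.0, 0.0, 1.5), math.radians(a), 0.0, 2.0)
         for a in range(0, 360, 10)]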

So the technologies that are needed to start this off are:

1) A quality laser range finder, with proper engineering to know its precise location and aim in 3d. I guess the best way to do this is to buy a 3d laser range finder...
2) Software that takes the 3d laser range finding points of an object and makes a mesh out of them (see the sketch after this list). It would also use a camera to get color information.
3) A similar piece of software that takes 3d points in an open area and detects the walls around you... This is actually a studied field of AI now that I should read up on.
4) A piece of software that takes the 3d meshes found in 2) and identifies them in the wild.
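For item 2), a crude stand-in for "play connect the dots" is a convex hull, which turns a point cloud into triangles. A real scanner would need something smarter for concave objects, so treat this as a placeholder only (it assumes numpy and scipy are available):

import numpy as np
from scipy.spatial import ConvexHull

def points_to_mesh(points):
    """Turn a scanned point cloud into a triangle mesh (vertices + faces)."""
    pts = np.asarray(points)
    hull = ConvexHull(pts)
    return pts, hull.simplices  # each row of simplices indexes one triangle

# Example: a fake scan of a small box-shaped object
scan = np.random.rand(200, 3) * [0.10, 0.05, 0.02]
vertices, triangles = points_to_mesh(scan)
print(len(triangles), "triangles")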


Okay, that closes this blog entry. I think it is a pretty big one, because it reduces AI's starting point to a problem that a single person, or a small team, could tackle with some help from people on the Internet.

12/17/2013: Blog Entry 3: Laser range finding + Optical vision combo?

I stuck this down here because I don't want people getting bored reading from the start. This site isn't set up well as a blog; I just want to get info out there easily.

I wonder if there is a way to use optics to see a surface like a wall and go, "I believe this is a wall, the way I'm looking at it," then fire the laser range finder at a couple of points on the wall to verify. I think this might be legit. In Blog Entry 2, I talked about a laser range finder scanning a large number of points and playing connect the dots to build meshes from range finding alone.
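A sketch of that verify step, assuming the camera has already proposed the wall as a plane (a point on it plus a normal direction); the 5 cm tolerance is an arbitrary assumption:

import math

def point_near_plane(point, plane_point, plane_normal, tolerance=0.05):
    """Check whether a laser-measured point lies within tolerance of the guessed plane."""
    diff = [p - q for p, q in zip(point, plane_point)]
    norm = math.sqrt(sum(n * n for n in plane_normal))
    distance = abs(sum(d * n for d, n in zip(diff, plane_normal))) / norm
    return distance <= tolerance

# Camera guesses a wall at x = 3.0 facing the robot; the laser samples two points on it.
wall_point, wall_normal = (3.0, 0.0, 0.0), (1.0, 0.0, 0.0)
samples = [(2.98, 0.5, 1.2), (3.04, -0.7, 0.9)]
print(all(point_near_plane(p, wall_point, wall_normal) for p in samples))  # -> True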

Whatever the case, I see technology naturally progressing toward these solutions. I predict that not long after scene recognition arrives, artificial intelligence will see its golden era. Once you have a robot whose AI knows where things are located spatially around it, it will be able to do any number of functions you can program within the abilities of the robot's body. Any number of different robot body styles will be able to use the same general AI software, but they will have different ways of interacting with the world.