Monday, February 15, 2010

PostTwiceDaily2 02/15/2010 (p.m.)

  • tags: no_tag

    • ((solve (make-tsp-problem)))
      ((compare-search-algorithms 'make-tsp-problem
      '(A*-search greedy-search uniform-cost-search)
      :n 5))
  • tags: no_tag

    • Let's assume everyone knows A* finds (one of) the lowest cost solution(s) when it has an admissible heuristic (not that it's important here, but in symmetric environments, such as path finding around a symmetric obstacle such as a circle or diamond, there are two lowest-cost paths, say to the left or right, and there's no specification saying which of these A* will find)





      Maybe i should explain the goal of the question. The important part of this question is really just subquestion 4. But i was worried that, if i had just asked that, people would have hard time getting started on the problem. So the first three parts are supposed to walk you through the process of thinking about the right things (posting the search tool was, however, perhaps too big of a hint). They aren't intended to be hard or trick questions. Plus it gives you a chance to show off your thought process (since you'll write a one or two sentence explanation for each part), letting me give you partial credit if you can't figure out the last part





      The first parts of the question are building up to your final answer, which is that "A* does well on problems of type X and poorly on problems of type Y because A* has the property Z which interacts with these environments as XYZ"
  • tags: no_tag

    • i've been asked to explain what question six on homework 2 is asking for. i think it'll help to define a few terms





      TSP - you're a traveling salesperson who goes city to city selling brushes door to door. Unfortunately, you're a really bad salesman and every place you visit results in an angry mob and arrest warrant, guaranteeing that you can never, ever visit that city again. So your goal is to go to every single city in your region (i.e., all the nodes/vertices in the graph) without ever visiting the same node twice. There are probably multiple ways to do that but the best way is the one that puts the fewest miles on your car. So if you want to visit (and then be run out of) LA, Chicago and New York, the path LA->Chicago->NYC is probably better than Chicago->NYC->LA. Oh, one final rule - you have a start city (home) and when you've visited everywhere else, you have to go home. Unlike the other cities, you're allowed to visit home a second time (because your mom has forgiven you), but only after you visit every single other city first.





      MST - if you haven't seen a minimum spanning tree before, go look at Wikipedia. Here's how it works - imagine you have a graph with lots of nodes/vertices that are connected to lots of other nodes/vertices. We probably don't need all those connections (called edges in graph terminology), but we need at least enough so that we can get to each node. In other words, we don't need five highways to Chicago, but we need at least (and in MST, at most) one. So we find the ones we absolutely need to connect to every single node and that forms a tree, which we call the spanning tree. There's probably a lot of those, and if you add up the total cost to travel every single link of those, you'll find that some cost less than others. Whoever costs the least is the MST. Note that the MST is NOT the same as the TSP. There's a critical difference, but i'll leave it up to you to re-read the description of the TSP and think about trees (or look at the MST image on Wikipedia to figure out what it is; the two nodes in the middle will probably give it away) and figure out what that difference is.





      Relaxed problems - like a real problem, but easier. There are a couple of rules about how you have to do things and they affect the length of your solution (cost of path found). A relaxed version of a problem gets rid of a rule or otherwise makes it simpler. And why do you care? Because if you relax it the right way, it's easy to figure out the cost of the path (to the relaxed problem). And why is that good? Because you can use that as an admissible heuristic - assuming you didn't do anything goofy, the solution to the more complex problem has to cost AT LEAST as much as the solution to the simpler problem. And since we want heuristics to underestimate (since it forces us to check other close options to make sure we didn't miss anything), that's a good thing





      Heuristic estimate - in an intelligent/guided search, the heuristic is your guess about the remaining cost for any given path. When the kids scream "are we there yet, are we there yet, how much further?", it's your guess about how much further you have to go. You can either underestimate or overestimate but if you overestimate, that's bad, so don't do it. For an admissible heuristic (which you need for an optimal algorithm), you need to guess low. But you do it the same way you play The Price Is Right - the heuristic who guesses closest to the actual cost without actually going over wins. And in this question, you're trying to win





      Comparison question - The purpose of question 6 (3.30 in the book) is to find the very bestest heuristic cost estimation function you can for the traveling sales person problem. Here are some possible heuristics:





      h = 0. This one always underestimates. But you'll end up searching most of the tree.





      h = straight line. The shortest distance between two points is a straight line. But unless you have an SUV that can jump canyons, swim through rivers and plow through corn fields, chances are you'll never actually drive a straight line. (You see how straight line/Euclidean/L2 distance is an admissible heuristic that is always <= actual cost, right?). For THIS question, the straight line heuristic means the distance from whatever node you're on back to the start/end city. It is NOT the sum of the of the straight line distances to every remaining city.





      h = MST. Say you have a spanning tree with 10 nodes and a link (edge) cost of 10, meaning the cost to travel every segment of the MST is 100. At the start of the problem, h=100. If you visit five of the nodes, your h=50.
    • Finally, the question





      Part A - i want two things here. First, explain how MST is a relaxed version of TSP. Second, explain how a solution to MST is an admissible heuristic. Both questions boil down to the same thing - explain why the value MST gives you is ALWAYS less than or equal to the actual cost. Here's an example - you've got a graph with a billion nodes (which you would if the problem was placing circuits on a processor) and an MST for that graph (unlikely since MST is a hard problem but hey, it could happen). You've visited all but 700 of the nodes. You look at those nodes on the MST (you get the subgraph, which is itself an MST) and count up the cost of all the links there. The cost is 500,000. So your heuristic returns an h cost of 500k. To be an admissible heuristic, that needs to be an underestimate (or dead on). You need to explain why the ACTUAL cost to visit those cities is AT LEAST 500k. So, in summary, explain how MST is like an easier (but still relevant) version of TSP and explain why we can use it to create an admissible heuristic (meaning always, not sometimes, but always underestimates)





      Part B - Dominate means it always beats it at The Price Is Right, meaning it is ALWAYS a larger number while still being an underestimate. So for this question, you need to show that the MST estimate will always return a higher (or equal) cost estimate than the straight line heuristic





      Once you've finished with parts A and B, you'll have proven this:


      actual cost >= MST heuristic >= straight line heuristic


      for the TSP problem
  • tags: no_tag

    • i assume you're referring to the admissible heuristic for A*. In practice, it's hard to find an admissible heuristic that isn't monotonic, so people tend to assume an admissible heuristic is also monotonic, although that's not absolutely required. For this particular problem, i don't think it matters, but if you think it should, you're allowed to make that assumption IF you explicitly state it (that's a general rule for assumptions in the class - stating assumptions saves grades)




Posted from Diigo. The rest of my favorite links are here.