Found a good thread on WebmasterWorld today where several people are doing some more extensive research on the sandbox. The nice part is that they are not looking at specific search engine result pages to obtain data, nor are they focusing on one particular data set (such as only age or only links). Instead, they are covering many different areas where a potential cause could apply. While the aggregate data presented is not really clear, it is nice to look over, and I will list the variables they are using to obtain the data. One weakness I see is that sites are not required to fill in all the information for each variable. Another is how you classify a backlink method as "aggressive". It could be up for interpretation.
Here is what they are looking at currently:
Month Domain Registered
Month Uploaded
Month Indexed
Number of Pages
Number of Backlinks
Method of Backlinks
Anchor Text
Adsense
Adwords
Content
Level of SEO
Pages are Static / Dynamic?
Dmoz?
I am not exactly sure this will pin down a reason for the sandbox, nor is it entirely intended to. It is more or less meant to identify areas that can be used in further study.
One of the members mentions that outgoing links could also be a cause for concern. The member claims to quote Google policy here:
you cannot be damaged by incoming links only by dubious outgoing links.
However, I don't follow this interpretation, as I can NOT find that exact phrasing anywhere on the Google website.
I did, however, find the following in their Quality Guidelines section:
Don't participate in link schemes designed to increase your site's ranking or PageRank. In particular, avoid links to web spammers or "bad neighborhoods" on the web as your own ranking may be affected adversely by those links.
Additionally, they go on to say:
Google listings are based in part on our ability to find your site by following links from other web pages.
So I don't know if I follow the "outgoing links will hurt you" idea, but I do know from cleaning up many Google penalties recently that they can and will temporarily penalize your rankings for links that look fishy. One case in particular involved footer links on image pages that got a site penalized. When the links were removed, the rankings came back. I guess it's up for interpretation, but I wanted to be clear about what was in those posts on WMW.