Good Forum Footprints Mini List for Hrefer
The SEO Bay Xrumer Forum and SEO Community.

  1. #11
    Praetorian GODOVERYOU is on a distinguished road
    Reputation:
    41

    Join Date
    28th Mar 2011
    Location
    Bunker in Cleveland
    Posts
    206
    Thanks
    18
    Thanked 41 Times in 26 Posts


    I've got to disagree....

    Quote: "In Additive Words, you want to put your forum footprints. It's OK to use inurl and other operators here."

    Additive words get attached to each search query independent of the search engine being used. Very few search engines support the inurl operator, and Google, the primary target for it, will flag the proxy's IP address when it is used even conservatively. The end result is that Google will ban the proxy's IP address.

    *That is why most public proxies you find are already banned by Google despite testing fine with most proxy checkers. Google has banned the IP for suspicious operator usage and other people are using those same proxies.*

    You will waste bandwidth using an operator on search engines that can't produce results based on it, and Google will ban the IPs. INURL is fine for SMALL scraping projects of very few keywords, but that's about it.

    Additive words SHOULD be used to identify the types of resources (or CMSes) that you want to pull results for the keywords FROM. I.e., if your keyword was cat, you would want cat+forum, cat+blog, cat+guestbook - you would not want cat+profile.php. You should not use additive words as a place for footprints. That's what the Sieve filter is for - to identify potential linking targets based on pre-determined successful footprints.

    Now, I was being general with the example of forum and blog. Really you would want to put "powered by wordpress" "powered by CMS" etc., for every CMS you know you can hit BASED on the research you've done on your previous success lists.

    Additive words = CMS identification.
    Sieve filter = CMS url footprints.
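    The two-line split above can be sketched in code. This is an illustrative sketch only, not Hrefer's internals: the function name and example words are made up, but it shows how additive words (CMS identifiers) multiply a keyword list into CMS-targeted queries.

```python
# Illustrative only: how additive words expand a keyword list into search
# queries. The names and example words here are hypothetical.

def build_queries(keywords, additive_words):
    """Pair every keyword with every additive word, one query each."""
    return [f"{kw} {aw}" for kw in keywords for aw in additive_words]

keywords = ["cat", "dog"]
additive_words = ['"powered by wordpress"', '"powered by SMF"']

queries = build_queries(keywords, additive_words)
# The effective keyword list is len(keywords) * len(additive_words) long.
assert len(queries) == 4
assert queries[0] == 'cat "powered by wordpress"'
```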

    Quote: "So put whatever footprints you want in Additive words, but put VERY SPECIFIC ones in the sieve filter."

    I have to disagree with that as well.

    Why have Hrefer scrape for footprints that the Sieve filter won't allow you to use anyway? That's wasting time and bandwidth.

    I've written a tutorial on Hrefer in the past, but the bottom line is that the Sieve filter is ONLY to be used to "pick" targets based on URL footprints. The Additive words file is to be used to get search results for the CMSes those footprints come from.

    WARNING: If you load up your additive words with tons and tons of footprints that your sieve filter won't let you use anyway, you will spend forever scraping a ton of duplicate results, as each additive word produces a new query.

    If you use INURL excessively, 99% of the bandwidth sent to Google will be wasted. 100% of the bandwidth sent to the other SEs will be wasted, as they CAN'T USE THAT OPERATOR.

    Last edited by GODOVERYOU; 08-04-2011 at 17:37.

  2. The Following 3 Users Say Thank You to GODOVERYOU For This Useful Post:

    Meatytreats (07-06-2011), NellyBob (09-05-2011), trulysuck (20-02-2012)

  3. #12
    Shaman NGtheTRUTH has a spectacular aura about
    Reputation:
    171

    Join Date
    21st Mar 2011
    Location
    Sunny California
    Posts
    428
    Thanks
    36
    Thanked 171 Times in 54 Posts
    Thanks for clearing this up for me. I've been running my Hrefer scrapes this way and have been getting good results thus far.

    I definitely misspoke about the inurl in additive words, however; Kensai taught me that and I got my wires crossed (hangs head in shame).

    I'm confused about my method not being great for the sieve part though, because when I look at the results I have so far on the scrape I'm running now (6 million URLs so far, about 25% done), about 90% are good working forums.

    Can you expand on this a little more? I'm just trying to understand. I'm seeing results now, but should I be seeing a lot more results if I switch my focus to this method?

    Appreciate it

    NG


  4. #13
    GODOVERYOU
    The sieve filter is probably fine, but the issue is filling up your additive words with a ton of additional footprints. If you are filtering by hostname, there is going to be a LOT of wasted time and energy.

    What ends up happening is that if you have a keyword list that's 500 keywords long, and you add 100 additive words, you effectively have a 50,000-entry keyword list. No big deal.

    But, for instance, my sieve filter for master runs is about 12,000 footprints. If I were to follow the advice of putting those into my additive words file, I would have 500 keywords x 12,000 additive words, for a total effective keyword list of 6 million individual queries that Hrefer will make.

    Sounds great, right? The problem is that those 6 million queries are based off 500 individual phrases. If you filter by hostname, each individual hostname may have 40 or 50 different footprints indexed by any given search engine, so the search engines are going to spit that same hostname out to you 40 or 50 times or more PER KEYWORD. That's a ton of duplicates you are collecting when, in the same time, you could be collecting unique links.

    In addition, you are going to spend all that time working through a keyword list 6m long, when in reality, you are only producing results for 500 keywords.

    Basically, if you load up the additive words with a TON of footprints, you are going to waste a lot of time collecting the same hostnames over and over and over.
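    The duplicate problem described above is just arithmetic. A quick back-of-the-envelope sketch, using the post's own numbers (the 40-footprints-per-host figure is the author's estimate, not a measured value):

```python
# Back-of-the-envelope version of the numbers in the post above.
keywords = 500
footprints = 12_000               # footprints misplaced into additive words
queries = keywords * footprints   # effective query count
assert queries == 6_000_000       # 6 million queries for only 500 phrases

# If ~40 footprints for the same CMS are indexed per hostname, then when
# collecting by hostname roughly 39 of every 40 hits are duplicates.
footprints_per_host = 40
duplicate_share = (footprints_per_host - 1) / footprints_per_host
assert round(duplicate_share, 3) == 0.975
```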

    The idea is to collect as many unique hostnames as possible, so you need to draw a line between teaching Hrefer the best way to identify sites where you can leave a link (using the Sieve filter) and the fastest way to get to those hostnames (with additive words), while at the same time not wasting a bunch of time and bandwidth recollecting the same hostnames over and over and over.

    That's the reason there are two separate text files: one is a filter, one is to append search queries. They can be used to do the same thing, but they are best used for their individual tasks.

    Like I said, for any given hostname, Google could have 40, 50 or more footprints indexed for it. If you search each footprint using it as an additive word, you are going to waste 39 or 49 threads collecting the same place time and time again.

    Think of it this way - the Sieve filter doesn't add any time to your scraping - it just enables you to collect more links based on places you know you can post to. The Additive words help Hrefer find them, but if they "overlap" too much, you will scrape exponentially more duplicates whenever more than two footprints for the same CMS are used in the search query.

    Whereas the additive words could even be considered optional, the Sieve filter is absolutely critical.
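    The Sieve filter's role described above is, conceptually, a substring match against known-postable URL footprints. A minimal sketch; the footprints and URLs are made up for illustration, and Hrefer's real filter syntax differs:

```python
# Illustrative only: keep scraped URLs that match a known URL footprint.
SIEVE_FOOTPRINTS = ["member.php?action=register", "ucp.php?mode=register"]

def sieve(urls, footprints=SIEVE_FOOTPRINTS):
    """Pass a URL through only if it contains any known footprint."""
    return [u for u in urls if any(fp in u for fp in footprints)]

scraped = [
    "http://a-forum.example/member.php?action=register",
    "http://a-blog.example/about",
    "http://b-forum.example/ucp.php?mode=register",
]
assert sieve(scraped) == [scraped[0], scraped[2]]
```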

    Last edited by GODOVERYOU; 08-04-2011 at 19:40.

  5. The Following 3 Users Say Thank You to GODOVERYOU For This Useful Post:

    llookk (27-02-2012), Meatytreats (07-06-2011), trulysuck (20-02-2012)

  6. #14
    NGtheTRUTH
    Alright, got it, that makes sense. I appreciate you shedding some more light on that. I did some toying around with all three and thought I had the best settings going because I was collecting a lot more links than before.

    So essentially the way I'm doing things right now isn't necessarily wrong, but it's so inefficient that it may as well be lol.

    Thanks again, I appreciate it; this will definitely help me scrape more effectively. Guess I'd best stick to my strong suit of SEO huh? lol

    It's alright tho, I'll get there.


  7. #15
    GODOVERYOU
    No big deal. I'm not trying to come in like a bull in a china shop, just so you know. There are a lot of ways of explaining what the two lists do, but they are so related that drawing any kind of meaningful distinction between them is often tough to do.

    If you had a keyring full of keys that will open doors for you to leave your links, the Sieve filter would be the keys on the ring and the additive words are a map to those doors. Hrefer says "Hey, I've got a key that will fit into that door!" when its Sieve filter matches, and finds the door by looking at its map.

    If the map keeps leading it to the same place over and over, well, fewer doors get unlocked.


  8. #16
    Deckhand xiaowoniu is an unknown quantity at this point
    Join Date
    27th May 2011
    Posts
    13
    Thanks
    2
    Thanked 0 Times in 0 Posts
    Do you have guestbook additive words for Hrefer? Thanks very much.


  9. #17
    Deckhand tmssj2000 is an unknown quantity at this point
    Join Date
    28th Apr 2011
    Posts
    43
    Thanks
    4
    Thanked 1 Time in 1 Post
    In that sense, would the strategy be to:
    1. scrape Google with all my forum footprints with a bunch of keywords (not too many) and no sieve filter, then
    2. try to post to them, obtain the SUCCESS/PROFILES
    3. analyse my SUCCESS/PROFILES for URL footprints
    4. place the URL footprints in my sieve filter
    5. punch in all the keywords I can find into the Words section
    6. scrape Google again... and keep repeating the process?
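    The steps above can be sketched as a loop. Every function here is a hypothetical stand-in (neither Hrefer nor XRumer exposes a Python API); only the shape matters: scrape, post, analyse successes, tighten the sieve, repeat.

```python
# Hypothetical sketch of the scrape -> post -> analyse -> refine loop.

def scrape(queries, sieve_filter):
    # Stand-in: pretend each query yields one URL. A real run would call
    # Hrefer and let its Sieve filter screen the results.
    urls = ["http://example.com/" + q.replace(" ", "-") for q in queries]
    if sieve_filter:
        urls = [u for u in urls if any(fp in u for fp in sieve_filter)]
    return urls

def analyse_successes(urls):
    # Stand-in for "post with XRumer, keep SUCCESS/PROFILES, and pull URL
    # footprints out of them". Here it just returns a fake footprint.
    return ["register"] if urls else []

def strategy(keywords, forum_footprints, rounds=2):
    sieve_filter = []  # step 1: no sieve filter on the first pass
    for _ in range(rounds):
        queries = [f"{kw} {fp}" for kw in keywords for fp in forum_footprints]
        urls = scrape(queries, sieve_filter)    # steps 1 and 6
        sieve_filter = analyse_successes(urls)  # steps 2-4
    return sieve_filter

assert strategy(["cat"], ["forum register"]) == ["register"]
```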


  10. #18
    Deckhand Nikats is an unknown quantity at this point
    Reputation:
    1

    Join Date
    24th May 2011
    Location
    Japan
    Posts
    17
    Thanks
    9
    Thanked 2 Times in 2 Posts
    Quote Originally Posted by GODOVERYOU View Post

    Additive words = CMS identification.
    Sieve filter = CMS url footprints.
    Just to clarify, because I couldn't find it anywhere else: is it really ONLY URL footprints that the sieve filter is for? This thread is really good; wish I could find your previous tutorial, GOY. Was it on this board? Also, 12,000 footprints!!! I've got some harvesting to do here.

    Last edited by Nikats; 07-06-2011 at 04:10.

  11. #19
    tmssj2000
    Quote Originally Posted by tmssj2000 View Post
    In that sense, would the strategy be to:
    1. scrape Google with all my forum footprints with a bunch of keywords (not too many) and no sieve filter , then
    2. try to post to them, obtain the SUCCESS/PROFILES
    3. analyse my SUCCESS/PROFILES for URL footprints
    4. place the URL footprints in my sieve filter
    5. punch in all the keywords i can find into the Words section
    6. scrape Google again...and keep repeating the process?
    kensai, am I right on my strategy?


  12. #20
    Big Boss kensai is a name known to all
    Reputation:
    635

    Join Date
    8th Dec 2010
    Posts
    2,221
    Thanks
    378
    Thanked 638 Times in 309 Posts
    @tmssj2000: That is one of many strategies you can use to scrape Google for linklists and forums with Hrefer and XRumer.
    The point is to use XRumer, analyse success, get URL footprints, put them in the sieve filter, put keywords in additive words and scrape.
    Rinse and repeat. But the next time, keep your sieve filters in place, as having none in there will produce massive lists with a lot of no-good URLs. So even for starters you might want to have a basic sieve filter. And by the way, always clean your lists before running them in XRumer, both via xblack and via dupes removal, especially if you don't use sieve filters...
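    The list-cleaning advice above can be illustrated with only the standard library: duplicate removal by hostname. xblack and Hrefer's own dupes removal are the real tools; this sketch, with made-up URLs, only shows the idea.

```python
# Illustrative only: keep the first URL seen per hostname, drop the rest.
from urllib.parse import urlparse

def dedupe_by_host(urls):
    """Return urls with every repeated hostname removed after first sight."""
    seen, unique = set(), []
    for u in urls:
        host = urlparse(u).netloc
        if host not in seen:
            seen.add(host)
            unique.append(u)
    return unique

urls = [
    "http://forum.example.com/member.php?action=register",
    "http://forum.example.com/index.php",  # duplicate hostname
    "http://other.example.org/ucp.php?mode=register",
]
assert dedupe_by_host(urls) == [urls[0], urls[2]]
```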



 
