Archive.org web trove hits FOUR HUNDRED BEEEELLION pages

The Internet Archive's "Wayback Machine" has announced that it has indexed four hundred billion web pages. The trove dates back to late 1996 and comprises at least fourteen petabytes, a figure we base on a 2012 declaration the archive hit 10 petabytes and a later post explaining that a fund-raising drive for another four …

COMMENTS

This topic is closed for new posts.
  1. Paul J Turner

    4 Billion pages...

    Let's hope that they didn't use 32-bit unsigned index values or they may find themselves more way-back than they planned!

    1. Annihilator
      Headmaster

      Re: 4 Billion pages...

      Four hundred billion...
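The arithmetic behind the joke above can be checked with a minimal sketch. The helper names (`fits_in_uint32`, `wrap_uint32`, `bits_needed`) are hypothetical and only illustrate the point; nothing here reflects how the Internet Archive actually stores its index.

```python
# Largest value a 32-bit unsigned integer can hold (4,294,967,295).
UINT32_MAX = 2**32 - 1

# Page count taken from the article's headline.
PAGES_INDEXED = 400_000_000_000


def fits_in_uint32(value: int) -> bool:
    """Return True if value is representable as a 32-bit unsigned integer."""
    return 0 <= value <= UINT32_MAX


def wrap_uint32(value: int) -> int:
    """Simulate unsigned 32-bit wraparound (arithmetic modulo 2**32)."""
    return value % 2**32


def bits_needed(value: int) -> int:
    """Return the minimum number of bits needed to store a non-negative value."""
    return max(1, value.bit_length())


if __name__ == "__main__":
    print(fits_in_uint32(4_000_000_000))   # the "4 Billion" of the first comment
    print(fits_in_uint32(PAGES_INDEXED))   # the four hundred billion of the headline
    print(wrap_uint32(PAGES_INDEXED))      # where a wrapped 32-bit counter would land
    print(bits_needed(PAGES_INDEXED))      # bits an index of that size requires
```

Four billion would still squeeze into 32 bits; four hundred billion would not, so an index of that size needs a wider integer type.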

  2. MrT

    Brilliant project!

    Can't remember visiting El Reg back then, but like most people who had anything online in the early days, it's a blast visiting stuff that no longer exists (some of mine from '94 was still around for the earliest snapshot). I like the roll-back feature, showing how things have changed over the years.

  3. T. F. M. Reader

    Understatement of the year?

    Who had more beer last night: me, The Reg, or the Wayback Machine? Their announcement says FOUR HUNDRED BEEELLION pages. Or at least that's what I saw. Twice.

  4. h3

    I think I first read El Reg in either 1999 or very early 2000.

    (Think it looked fairly similar to how it is now).

  5. mix
    Thumb Up

    I miss low res animated gifs

    Ahh, those pre-flash days, twirly gif logos all over everyone's page because they were 'dynamic' and 'cool'...

    *goes all misty-eyed*

    1. Annihilator
      Go

      Re: I miss low res animated gifs

      Not to mention the yellow and black striped sign with "UNDER CONSTRUCTION" emblazoned all over it.

      1. WraithCadmus

        Re: I miss low res animated gifs

        If you fancy putting one into your next web project here's a selection

        And if, like the commentards below, you miss the <blink> tag, then CSS3 can help.

      2. MrT
        Stop

        Under construction...

        ... don't forget little graphics of envelopes with "No Junk Mail" on them, right under a plaintext email address ... because that stopped spam in its tracks ;-)

  6. JDX Gold badge

    even shows how El Reg looked as a young vulture in the summer of '97

    I prefer that version!

  7. Zog_but_not_the_first
    IT Angle

    HTML of yesteryear

    </blink>

    How we've missed you.

    1. Anonymous Coward

      Re: HTML of yesteryear

      "</blink>

      How we've missed you."

      You only miss it once it's gone. No really, no need to reimplement... please...

  8. SW10
    Happy

    Tip-top technical guidance from Ye Olde El Reg

    "Use the back key on your browser to return to the previous page..."

    You have to wonder who the target audience were!

  9. Khaptain Silver badge
    Holmes

    Remash of old material

    Ah ha, now we have proof that the El Reg hacks simply remash existing material.

    Win X Rollout

    El Reg hack swaps Win 98 for W8, changes one or two names, drops in the obligatory threat from company X who is "prepared" to move to an alternate OS, et voilà, Bob's your auntie.

    Bill Gates was already the world's richest man, for the 4th year running, with an estimated 51 billion - so we can easily remash that as well....

    Interesting to see how nothing much changes.

  10. Winkypop Silver badge
    Thumb Up

    Ahh the wayback machine

    How many times have I used ye to retrieve old content long since deleted from our servers?

    Praise be to the wayback machine!

  11. Vociferous

    It's a sad testament to the state of the world...

    ...that "Gates Owns Even More of Everything -- Official" is now the Good Old Days.

  12. ProperDave

    I had a friend show me an odd 'bug' in the Wayback Machine once - he bought a domain and set up his own personal website on it, which, as it turned out, had been owned several years before his purchase by a small foreign telecoms company.

    The TelCo had a blanket denial robots.txt file which told all spiders to F- off, and because of this, the Wayback Machine would refuse to allow him to browse the historical snapshots of the domain during the time he owned it, despite indexing his site according to his web traffic logs.

    I just shudder at what Wayback Machine holds on me - I can see my very first websites thanks to the history, back when I did terrible things like build websites in Lotus Word Pro (which was marginally less of a sin than building them in Word).

    1. Anonymous Coward

      I just hit the same 'feature'. My domain goes back on there to 2000, yet I can't see it due to the robots.txt thing. I just changed my robots.txt to allow all, and it still refuses to show me the archived pages.

      So I don't understand how/why this works like this?

      1. ProperDave
        Pirate

        Seems there's some discussion on it on the Archive.org forum;

        ( http://archive.org/post/406632/why-does-the-wayback-machine-pay-attention-to-robotstxt )

        Doesn't appear to be any sensible consensus on what they should do to fix this... but this is totally off-topic for this article. :)

        Pirate flag, as I've partially hijacked the topic! (we need a tangent icon).
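The behaviour described in this thread (the robots.txt currently served for a domain deciding whether older snapshots are shown) can be sketched with Python's standard `urllib.robotparser`. This is only an illustration of the rule as the commenters describe it, not the Wayback Machine's actual code. The function name `snapshots_viewable` and the user-agent string `ia_archiver` are assumptions made for the example.

```python
from urllib.robotparser import RobotFileParser

# Assumed crawler name for illustration; check the archive's own
# documentation before relying on it.
ARCHIVE_USER_AGENT = "ia_archiver"


def snapshots_viewable(current_robots_txt: str, url: str,
                       user_agent: str = ARCHIVE_USER_AGENT) -> bool:
    """Decide whether snapshots of url would be shown, judged solely by the
    robots.txt that the domain serves today (hypothetical rule)."""
    parser = RobotFileParser()
    parser.parse(current_robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A blanket denial like the one the telecoms company is said to have used.
BLANKET_DENIAL = "User-agent: *\nDisallow: /\n"

# An "allow all" file: an empty Disallow value permits everything.
ALLOW_ALL = "User-agent: *\nDisallow:\n"

if __name__ == "__main__":
    page = "http://example.com/index.htm"
    print(snapshots_viewable(BLANKET_DENIAL, page))
    print(snapshots_viewable(ALLOW_ALL, page))
```

Under this rule the file's history is irrelevant: only the latest robots.txt is consulted, which is how one owner's blanket denial can end up hiding another owner's pages. The sketch does not model why pages stay hidden for a while after the file is changed to allow all.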

  13. ecofeco Silver badge
    Pint

    Love the Wayback Machine!

    After several PC upgrades, I eventually lost the files that were my first websites, which I had built back in the gay 90s. Some of them quite good!

    Gone for ever I thought. Oh well.

    Spent a few years trolling the Internet Archive thinking they might show up one day. After many, many years, lo and behold! They did!

    Now when I tell people I used to build websites back in the early days of dinosaurs, fire and the stone wheel, I have proof. And you know what? They still look pretty good.

    Here's one sample. Check the date. https://web.archive.org/web/20010406065812/http://www.worldtv.org/index.htm

    Flame on. :)
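For anyone who wants to "check the date" without clicking through, here is a minimal sketch that pulls the capture time out of a snapshot link like the one above. It assumes the 14-digit path segment is a `YYYYMMDDhhmmss` timestamp, which is how the URL appears to be laid out. The function name `parse_snapshot_url` is hypothetical.

```python
import re
from datetime import datetime

# Assumed layout: https://web.archive.org/web/<14-digit timestamp>/<original URL>
WAYBACK_URL = re.compile(r"^https?://web\.archive\.org/web/(\d{14})/(.+)$")


def parse_snapshot_url(url: str) -> tuple[datetime, str]:
    """Split a snapshot URL into its capture time and the original page URL."""
    match = WAYBACK_URL.match(url)
    if match is None:
        raise ValueError(f"not a recognised snapshot URL: {url!r}")
    timestamp, original = match.groups()
    # strptime raises ValueError if the digits do not form a valid date.
    return datetime.strptime(timestamp, "%Y%m%d%H%M%S"), original


if __name__ == "__main__":
    captured, original = parse_snapshot_url(
        "https://web.archive.org/web/20010406065812/http://www.worldtv.org/index.htm"
    )
    print(captured.isoformat(), original)
```

Read this way, the link in the comment points at a capture from 6 April 2001.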

    1. This post has been deleted by its author

      1. ProperDave
        Pint

        Re: Love the Wayback Machine!

        Check out http://www.fabricland.co.uk/ - it's like playing bad web design bingo!

        "New Page 1", Framesets, pointless gifs, horrific colours, marquees, table layouts, center aligned text, broken links, personal drawings/quotes unrelated to the site... the list is almost endless!

        ... I hope that site never changes... it's a fantastic example of everything bad *and* it's an actual live site! :o

  14. C. P. Cosgrove
    Thumb Up

    Early logo

    I wasn't a reader then, but I have to say that the early logo still looks quite stylish, incorporating as it did both the 'R' and the vulture.

    Yes, a definite 'like'

    Chris Cosgrove

  15. theotherbill

    Old Copies of "As The Apple Turns"

    appleturns.com was responsible for many a noser back in the day. Where is Jack Miller?
