For years the headlines have been about open source cannibalising proprietary software. But what happens when open source starts to cannibalise itself? In some markets, open source rules the roost. For example, Drupal, Joomla, my old company Alfresco and other open-source content management systems regularly duke it out for …
Unfortunately, as well as eating itself, it also splits and forks itself. Web servers have always been a FOSS stronghold. Everywhere else that counts is full of crap competition: KDE vs Gnome - developer show-offs leading to no winners. LibreOffice vs OpenOffice - no winners due to diluted development.
Yes it's one of the best things that you can fork a project if it heads in the wrong direction, but it also dilutes development effort trying to do 10 things at once.
Agreed. The benefit of FOSS is the community; it's also the Achilles heel. There are so many projects out there that I kinda wish would merge together, if not for the sake of more support, just so we don't have all these choices which are pretty much identical.
"LibreOffice vs OpenOffice - no winners due to diluted development."
Given that Oracle seems to have an Apple-like control freak complex, I fear that there's a good chance that if the devs had not forked OO.o we wouldn't have a choice at all.
See also: Hudson/Jenkins and also MySQL/MariaDB.
I guess you missed where Oracle handed over OpenOffice to Apache.
"I guess you missed where Oracle handed over OpenOffice to Apache."
Only after LibreOffice forked and the original suffocated. It was limping on at that point and might have limped on under Oracle's thumb for quite a while had there been no alternative place for the developers to contribute code.
@wowfood - Don't want choice ?
Then just go Microsoft or IBM and poof, the choice is gone. See, I solved your problem so leave all those projects alone for the rest of us.
Another Achilles heel is the attitude of the developers.
You can raise bugs that are genuine with Apache using their bug system and have it closed with a status of "WONT FIX" with little or no justification other than the fact that they don't want to change something as it may break another feature. So much for the "many eyes" approach of ensuring code will work and is secure.
KDE vs. GNOME
That particular competition has kinda produced results. Qt is the one true multiplatform GUI toolkit, and KDE was much easier to develop for.
Aren't the licenses different too? Apache vs GPL?
It appeared to me that the handover to apache was Oracle being snarky because the devs forked to LibreOffice. Apache sponsors gain from the apache license (for incorporating a webserver into proprietary products), but I doubt there is much call for people to embed an office suite into a non-gpl product.
Re: KDE vs. GNOME
Also, when one GUI dev team goes off the rails, you can switch until they get their act together again, or you can mix and match.
For example, you could run a gnome version of firefox together with a KDE desktop environment, or k3b under the LXDE.
Personally, I'm glad to be able to avoid Gnome3 on the desktop but I like KDE for bells & whistles and LXDE for low-power, low-function scenarios.
Wastefulness and vitality seem to go hand-in-hand in IT as in other areas of life.
Redundancy = Resilience
"you can fork a project ... but it also dilutes development effort "
Redundancy enables resilience and evolution. Think about DNA, meiosis, non-lethal mutation and natural selection. The parallels are forking, innovation, sharing, competition, and progress.
KDE vs Gnome
Gnome might be self imploding, but I very much like KDE and find it to be very functional. So I reckon you're wrong there.
Good forks merge down the road (or their codebase is pulled into the trunk, or into someone else's trunk.)
Bad forks wither and die.
It's always been like this in opensource.
Re: Redundancy = Resilience
You missed horizontal gene transfer - this is opensource's greatest asset.
Re: Redundancy = Resilience
Hmm, phage assisted sex. Mammals have more fun!
I just wish the presentation software on my Mac didn't crash and destroy my work. Stopped using it then upgraded and it still went boom. The spreadsheet and word processing are fine. I did buy the Apple office equivalent and it's fine and about a tenth of the price of MS Office, but it's a bit too ... Apple. You can't save as a different format but have to export, things like that.
Also has an unmentioned use of rescuing broken MS Office documents - if you have a document that won't load in the MS tools LibreOffice will usually help you at least get your content back :)
I would also quite happily part with a few Euros to help pay for dev, e.g. the broken presentation piece, but there isn't a mechanism.
on the performance front. I've been playing around with a home server for a while, and I've tried running a couple different setups
IIS, Apache and nginX. I plan to try Hiawatha soon also.
What I found with IIS and Apache running on a fairly dated home PC was that they were both pretty poor to respond. I admit a lot of this was from connecting via my home network, but it was taking a few seconds to load each webpage, and these were pretty basic.
Popped on nginX to give it a try, and even then it was noticeably faster.
Reason I plan to try Hiawatha next is simply because of security. Playing with a web server on my home connection isn't the greatest idea in the world, I'll admit instantly I'm not a web admin, I don't know much about it, hence why I'm looking into a far more security driven server architecture. Depending on resource usage, how secure it actually is and how much power it draws, I may wind up hosting for myself using it.
Commentards! I beseech thee, know of any additional measures to secure a home web server, whether it be router level, OS level, or even a few tweaks inside the server itself if you know of any way to make Hiawatha slightly more secure.
Even on a Raspberry PI Nginx holds up to a pounding from Blitz.io
"What I found with IIS and Apache running on a fairly dated home PC was that they were both pretty poor to respond. I admit a lot of this was from connecting via my home network, but it was taking a few seconds to load each webpage, and these were pretty basic."
Which raises the question of how you used them. Did you install them and use them "as is", or did you tune the critters? Because speaking from personal experience I can say that both servers can easily run on older hardware (depending on the version), where obviously Apache tends to be lighter by default because it doesn't have appserver functionality embedded.
I've been running both on a 2k3 server, 2GHz AMD Athlon with 1GB memory and it pretty much works as expected. No noticeable issues wrt. response time (IF you tune them).
That could have been it then, I was effectively going 'out of the box' setting up the basics of what I needed and leaving it at that. Mind you I was trying to throw it together in a bit of a hurry for a uni project that was coming up.
I ran an Apache site on a 500MHz dual core IBM server with 612MB RAM. It ran without any issues and didn't come much above 10% CPU. This was using PHP and MySQL with only a few thousand visitors a day but still not bad and very quick page load times.
Playing with a web server on my home connection isn't the greatest idea in the world
Learn how to set up a Demilitarised Zone (DMZ) on your network. Simply put, you make a separate subnet for your web server and use IP filtering rules (at your router) to allow machines outside that subnet to access it, but block all outgoing traffic (apart from responding to already-established connections initiated from other hosts). It can be as simple as three iptables rules: one default rule drops all forwarded traffic, one allows NEW connections to be forwarded to the DMZ box and a third allows packets that are ESTABLISHED or RELATED to be forwarded from the DMZ box. In practice, you'll probably want to do something more complicated, like doing NAT masquerading and port-forwarding at the router (so that all your machines appear to be at the same IP address and so that traffic coming from the Internet on port 80 is forwarded to the DMZ machine, respectively) so I can't give you the exact iptables commands or other firewall rules here.
Likewise, if you need to allow the DMZ machine to access certain services inside your network (that you can't or don't want to store on the DMZ machine) then you need to add more rules to allow it to make those connections. You'll want to lock down that service so that the DMZ machine can only do the bare minimum with it that it needs to operate without leaving a big hole in your security. Or better yet, migrate a minimal version of the service to the DMZ box itself or another machine on the DMZ subnet. There's always a trade-off between security (risk of the machine getting hacked) and utility (eg, you'd really like to be able to access your IMAP server) with any machine connected to the net, but a DMZ is a nice way, up to a point, to get the best of both worlds.
So basically, look up setting a DMZ for your particular router and learn about how to set up firewall rules in general.
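For what it's worth, the three-rule policy described above can be modelled as a small decision function. This is only a toy sketch in Python, not real firewall configuration: the `Packet` class, the `forward_allowed` name and the `DMZ_HOST` address are all made up for illustration. It also assumes that packets on an already-tracked connection are let through towards the DMZ box as well as away from it, since otherwise a client's follow-up packets on a connection it opened would hit the default drop.

```python
from dataclasses import dataclass

DMZ_HOST = "192.168.2.10"  # hypothetical address of the DMZ web server


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    state: str  # connection-tracking state: "NEW", "ESTABLISHED" or "RELATED"


def forward_allowed(pkt: Packet) -> bool:
    """Return True if the router should forward this packet."""
    # Anyone may open a NEW connection to the DMZ box.
    if pkt.dst == DMZ_HOST and pkt.state == "NEW":
        return True
    # Packets on an already-tracked connection may leave the DMZ box.
    # Assumption: the same is allowed towards the DMZ box, so a client's
    # follow-up packets on a connection it opened are not dropped.
    if pkt.state in ("ESTABLISHED", "RELATED") and DMZ_HOST in (pkt.src, pkt.dst):
        return True
    # Default policy: drop everything else that would be forwarded.
    return False
```

With only these rules the DMZ box cannot open connections of its own, which is the point. A real router would need further rules (NAT, port forwarding, outbound traffic for the other machines) on top of this.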
Other than that, your distro should have packaged the web server to be pretty secure already, such as running it as a user with restricted rights (nobody in Unix-based systems) and maybe it also gives you the option of running in a chroot jail too.
Have a quick look at the Cherokee web server whilst you are at it; it's a rather interesting project.
The performance issues of thread-per-conversation (or process-per-conversation) are well-known. Indeed, they've been a commonplace in, say, discussions on comp.protocols.tcp-ip since at least the mid-1990s.
The advantage of TpC is that it's largely straightforward to implement and use, provided your request/response processing doesn't involve too many pieces of shared data that require synchronized access. This is even less of a problem with PpC, since the tasks don't share address space.
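To show how little code TpC needs, here is a minimal Python sketch. It is an illustration only, not how Apache or any real server is written: `handle_conversation`, `dispatch` and `demo` are invented names, the upper-casing is a stand-in for real request processing, and `socket.socketpair()` stands in for the connection an accept loop would hand over.

```python
import socket
import threading


def handle_conversation(conn: socket.socket) -> None:
    """Serve one client from start to finish on its own thread."""
    with conn:
        request = conn.recv(1024)        # blocks only this thread
        conn.sendall(request.upper())    # stand-in for real request processing


def dispatch(conn: socket.socket) -> threading.Thread:
    """What an accept loop would do for each new connection in a TpC server."""
    worker = threading.Thread(target=handle_conversation, args=(conn,), daemon=True)
    worker.start()
    return worker


def demo() -> list:
    """Run two conversations, each served by its own thread."""
    replies = []
    for text in (b"first", b"second"):
        client, server_side = socket.socketpair()  # stands in for accept()
        worker = dispatch(server_side)
        with client:
            client.sendall(text)
            replies.append(client.recv(1024))
        worker.join()
    return replies
```

Each handler is plain sequential code with no shared state, which is the simplicity argument. The cost is one thread (or process) per open conversation.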
The real reason why Apache has a TpC model, though, is historical: it evolved from NCSA httpd. (The original Apache was a set of patches for httpd - hence the name.)
The thing is, most websites don't see that much traffic. And most of the ones that start to get a bit sluggish can just throw more cheap hardware at the problem - traditional web serving scales out nicely. That, plus Apache's feature set and established base of developers and administrators (which mean there's a large knowledge pool to draw on), mean Apache will continue to be the major player for a long time to come.
Matt uses market share as a metric to show Apache is "sliding", but in a case like this market share is largely meaningless. For a long time, Apache was hugely dominant because of the lack of other suitable choices for many customers. Now there are more choices, which better fit the needs of a significant portion of the market, so we're seeing a correction as sites that were never well-suited to Apache anyway move to Nginx or other alternatives. This is hardly news.
What's more puzzling is the thesis of the article, that "Open Source" is "cannibalizing" itself. For as long as the Open Source moniker has existed, there have been alternatives in nearly every category - Matt himself starts by mentioning some of the various CMSs. And obviously over time some gain market share (insofar as that means anything) at the expense of others; so this is not news either. And worse, the whole idea of "cannibalizing" only applies if all FLOSS projects are somehow bound in some kind of zero-sum mercantile economy, where the rise of one must be matched with the fall of another. That's a reflection of a naive dualist metaphysics of the software market, partitioning it into "open" and "proprietary" as though the two were antipodes.
Isn't just about speed.
I've used lighttpd and now nginx because they are small and simple and yet still do everything I need them to. Apache seems a wee bit bloated and inconvenient by comparison.
MySQL has fewer features than a proper database, agreed. Being a proper bloody database, for a start. However, it certainly isn't faster by any measure at all.
Oracle, PostgreSQL and SQL Server all make it look like it's emulating a 286 to run on.
Not when I did like-for-like tests with Oracle & MySQL, some 6 years ago (MySQL 3.23!). MySQL was at least twice as fast for most things.
If you don't need bomb-proof transactions (and most batch processing doesn't, with suitable design) MySQL is a very good option, with a powerful set of functions built into its SQL. It's also extremely reliable, so long as the power's on. It's also about 10 times quicker to restore if/when your hardware croaks.
Comparing MySQL with no recovery on to Oracle with full recovery is not a fair or useful comparison. If I compare Sybase running without recovery, or better still with an in memory DB it outperforms MySQL many times over as well as having more features.
I liked MySQL and used it a lot before Oracle bought it but I always use Sybase or MS SQL Server for bigger projects.
MySQL vs Oracle?
Puleeze. Get a real RDBMS in Informix's IDS now under IBM.
Even with IBM's inept management of the product there's a rabid core following and for a good reason. ;-)
It's more ORDB than its competitors and is still the fastest when tuned properly.
There's even a scaled down free version...
MySQL has its uses - Is there a problem with that amongst the cognoscenti? Well I are one too. I like to use the right tool for the job, and for some jobs, in my experience, MySQL is that tool.
If I had a 1000 clerks banging in cheque details, I'd use a different database, but since I specialise in data cleansing & manipulation for batches of a few thousand; I appreciate performance, but can fix failures, so I often use MySQL - still version 3.23, I might add. As stated above, MySQL is extremely reliable, and I have not seen any need to migrate the Perl / MySQL "engine" I wrote seven or eight years ago to automate the running of these jobs, which have recovered much moolah in return.
What a refreshing change.
"fighting for market supremacy in the only way open source really knows how: technical merit"
Makes a change from all that bloody litigation that's going on.
Re: What a refreshing change.
"Because both have to appeal to developers, and developers have a low tolerance for marketing speak."
Oh so true. Which is yet more proof that FOSS is a good idea (TM).
Don't replace the king, replace monarchy
Completely agree, Matt. Fragmentation might have its drawbacks, but diversity - the other side of the same coin - is absolutely essential during the disruptive phase. Just yesterday, I saw yet another post about how a particular technology area (in this case storage) lacked a dominant open-source technology. I've bemoaned the lack of any such alternative myself many times, but I disagree with the author about the desirability of having a *dominant* open-source alternative. I think there should be *many* open-source alternatives, none dominating the others. They should be sharing knowledge and pushing each other to improve, giving users a choice among complex tradeoffs, not declaring themselves the new "de facto standard" before the revolution has even begun in earnest. We don't need another Apache or gcc stagnating in their market/mindshare dominance until someone comes along to push them out of their comfort zones. Being open source is not sufficient to gain the benefits of meaningful competition. One must be open in more ways than that.
P.S. wowfood, I've just started switching my own sites from nginx to Hiawatha, also mostly because of security. While I don't have any specific tips to offer (except perhaps one about rewrite rules that I'll blog about soon) you might be pleased to know that it's going quite well so far.
Two different things with some overlapping functionality
Apache Httpd is often used as an app server in LAMP style deployments, Nginx is often used as a load balancer / reverse proxy that sits in front of app servers. They have overlapping functionality but those would be their main specialities.
So sites might use them in conjunction, or not, depending on what the site is trying to do, the type of content it's serving (dynamic / static) and what is generating the dynamic content.
Re: Two different things with some overlapping functionality
Agreed - though I've seen them in conjunction in presentation/app tier arrangement. Use nginx to proxy and serve static content (very quickly) and save apache, which does have a larger resource footprint, to handle dynamic content. (eg serve static html, images, js, css, etc from nginx and then just use apache to handle the rest).
As you say, the design of the site dictates how much static content you can hand off to nginx, but in some situations, it's a useful tool to use.
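The split described above boils down to a routing decision per request. A rough Python sketch of that decision follows. It is not nginx configuration, and the `choose_tier` name and the suffix list are purely illustrative assumptions.

```python
from pathlib import PurePosixPath

# Illustrative list only; a real site would decide this in the front end's config.
STATIC_SUFFIXES = {".html", ".css", ".js", ".png", ".jpg", ".gif"}


def choose_tier(url_path: str) -> str:
    """Pick which tier answers a request, based on the file suffix alone."""
    suffix = PurePosixPath(url_path).suffix.lower()
    # Static files are answered by the front end; everything else is proxied.
    return "front-end" if suffix in STATIC_SUFFIXES else "backend"
```

For example, `choose_tier("/img/logo.png")` picks the front end, while `choose_tier("/index.php")` is handed to the backend.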
"But web servers? That's a market that Apache won ages ago, with no open-source competition to speak of."
Uhm, hate to break it to you, but Apache IS open source.
Its name is based on the fact that it was originally NCSA HTTPd, but had patches applied to it. Journalism, please!
Re: What the...?
Your Fail is a failure. He knows Apache is open source, what the sentence means is that there was no OTHER open source software that competed with it. The entire article is about competition between different open source implementations.
Re: What the...?
NCSA httpd, Lighttpd, nginx are all over 10 years old.
Re: What the...?
" with no open-source competition to speak of." != " with no open-source competition"
Apache is not a synchronous web server. Apache has a mode of operation that is synchronous. It also has an asynchronous mode. In its asynchronous mode, it is just as fast as nginx, yet supports many more 3rd party modules.
Apache 2.2 ships, by default, in synchronous mode. Why? Because Apache is commonly used to make a LAMP stack. PHP in the form of mod_php historically does not play well in a threaded environment, usually due to its extensions.
The solution is to run php-fcgi instead of mod_php when running asynchronously. This is actually better since it separates the PHP interpreter from the request handler, which increases performance. This model, php-fcgi and asynchronous workers, is exactly how nginx works, and the two are comparable in speed in this configuration.
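As a toy illustration of the general event-driven idea (not of how Apache's or nginx's workers are actually implemented), the Python sketch below lets a single thread keep many requests in flight while each one waits on a backend. The names and `BACKEND_DELAY` are hypothetical, and `asyncio.sleep` is a stand-in for waiting on a separate PHP process.

```python
import asyncio
import threading
import time

BACKEND_DELAY = 0.05  # hypothetical time spent waiting on the PHP backend


async def handle_request(request_id: int) -> tuple:
    # Waiting on the backend yields control, so other requests can progress.
    await asyncio.sleep(BACKEND_DELAY)
    return request_id, threading.get_ident()


async def serve_all(count: int) -> list:
    # Start every request at once and collect the results.
    return await asyncio.gather(*(handle_request(i) for i in range(count)))


def run_demo(count: int = 100) -> tuple:
    started = time.perf_counter()
    results = asyncio.run(serve_all(count))
    elapsed = time.perf_counter() - started
    threads_used = {thread_id for _, thread_id in results}
    return len(results), len(threads_used), elapsed
```

All the requests are handled on one thread, and the total time is far below what handling them one after another would take, because the waits overlap.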
So why isn't this the default configuration for Apache/PHP? Ease of upgrade. It is too confusing, say packagers, to ask people to change how they deploy their PHP apps on Apache, so it cannot be changed. Also, the package will include almost every stock Apache module, and they will all be loaded by default.
So install LAMP on Ubuntu, and you get the slowest possible way of serving PHP, by design. Install nginx, and you get the fastest. This is where the lighty/nginx/New Cool argument comes from, people install the stock configuration and think Apache is some slow beast that takes all your RAM.
Apache, properly configured, is amazingly fast and light on memory. Plus, you get the entire ecosystem of Apache modules to use. There are many books written on Apache module development, and thousands of books on Apache configuration and howtos.
Finally, about web servers. Web servers are an amazingly popular bit of software to write. It's so simple to do, that they massively proliferate, each claiming to be the fastest most agile web server going - I'm looking at gunicorn, Tornado, et al here.
I'm not going to comment on their speed, but instead the speed of the thing you are serving. Frankly, how fast the web server does its web server tasks is massively irrelevant in the overall scheme of things. Any request involving DB queries will swamp the amount of time the web server spends handling the request. Any request not involving DB queries is a static file, and should be served from cache or disk, which is a hard thing to do slowly.
There is nothing wrong with nginx or lighty, they are both excellent web servers. But so is Apache, and rumours of its death are greatly exaggerated. If you already have Apache skills, changing to nginx means learning new syntax and gotchas, and losing all your experiences and custom modules, and it still won't go faster than your app.
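A back-of-envelope calculation makes the point concrete. The figures here are made up for illustration, not measurements: assume the web server spends 1 ms on a request and the application plus database spends 50 ms.

```python
def fraction_saved(server_ms: float, app_ms: float, speedup: float) -> float:
    """Fraction of total request time saved if only the web server gets faster."""
    before = server_ms + app_ms
    after = server_ms / speedup + app_ms
    return (before - after) / before


# Hypothetical numbers: 1 ms in the web server, 50 ms in the app and database.
saving = fraction_saved(server_ms=1.0, app_ms=50.0, speedup=2.0)
print(f"{saving:.2%} of the request time saved")
```

Under those assumed numbers, a web server that is twice as fast saves just under 1% of the total request time.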
tl;dr - use Apache 2.4, event MPM and php-fcgi.
So, clueless admins using a bad configuration are just as much of a problem under *nix as under windows, requiring the people writing the software to work around the "admins".
@Peter2 - That is correct!
However, working around the admins in Unix is way much easier and faster than it is in Windows.
U mean mat rote a article without checking his fax? it canot be.
(BTW apropos other comments, squid also can operate as a reverse proxy.
I also get the impression that http caching control headers are really minimally understood, if they're known about at all, by a large number of web designers)
Although Apache 2.4 came out 10 months ago it was not ready for production use. For instance: it took a few months for perl and php to work with it, then this all needed to be adopted by the distributions. This has happened to an extent (eg it is in Fedora & Debian experimental), but the main stable distros come out with a new major release every few years and won't change before then. So expect it in RedHat 7 (out later this year) and Debian 7 (maybe this year) or Debian 8 (2014+).
Don't expect most people to run it on serious production platforms until then. Even then, transition will be slow since we don't like to upgrade stable, supported systems.
2% Google, does that mean that Google have a special web server for their web servers, and that their web servers are 2% of the market, or does Google actually distribute a web server for other people to use?
They have their own, imaginatively called Google Web Server or GWS
In the beginning were homebrew hacks. Then two webservers emerged as leaders: things you could download and could run more-or-less out-of-the-box. Both NCSA and CERN servers were open source.
NCSA begat early Apache, which ruled the world in the second half of the '90s. But by the turn of the century, people were doing better. Apache 1 begat three new servers: Apache 2 from the same development team (modulo its evolution over time), and lighttpd and nginx as independent efforts, all drew on parts of the architecture and codebase. But their focus was different: Apache 2 with a lot of powerful APIs became a highly flexible and extensible applications platform, while nginx and lighttpd focused on raw performance in a more limited task. nginx has for some time looked like the winner in the "lean and mean" space, but is not designed with extensibility in mind!
And so we have horses for courses. The open source of NCSA and Apache 1 served its purpose, as different developers incorporated bits of them into new and improved products, serving different needs. Though of course there's still plenty of common ground: the regular web server or proxy, where any of them meets the needs of the vast majority of users without sweat.