Twitter survives election after Ruby-to-Java move

Micro-blogging site Twitter experienced record traffic as the results of the 2012 US Presidential election were announced on Tuesday night, but the service never faltered despite the increased load – something Twitter engineers credit to the company's move from Ruby to Java for its backend software. According to a blog post by …

COMMENTS

This topic is closed for new posts.

Bronze badge
Thumb Up

Not surprised

I thought Ruby was not so great as a practical language, and this proves it; also Ruby on Rails faces competition from Grails (Groovy on the JVM) too.

Scripting style code is fine for small tasks, but can be painful for maintenance and performance, as this illustrates.

6
8
Silver badge
Holmes

Re: Not surprised

Well, "Grails" is actually "Ruby on Rails using Groovy", and Groovy essentially _is_ a scripting language on the JVM.

I really prefer it to Java, too (though Clojure is sitting there like a nice cake that I cannot have ... hmmmm) and it can be used without much pain. It's slower but certainly not painful for maintenance if you stay reasonable. And you always should stay reasonable.

0
0
Anonymous Coward

Re: Not surprised

Dynamic languages suck for large systems. You can't refactor, since the tools can't find all potential callers of the code you're changing, and you have little confidence anything will work because the compiler can't catch the swathes of errors that a static language compiler can. You find yourself writing enormous quantities of unit and integration tests that are testing the inadequacies of the language rather than your logic - in other words, you're doing the work that would be left to the compiler and static analysis tools for a statically typed language. Convention over configuration sucks, since it makes it hard to look at any significant chunk of code and reason about the context it will work in, as too much "magic" is going on. A similar problem exists with Aspect Oriented Programming in the Java world using things like AspectJ, which is why this form of AOP should be avoided. The heavy reliance on reflection means performance sucks, and in the case of Grails the framework is a rapidly moving target of new and deprecated features where old bugs aren't fixed but plenty of new ones are introduced.

Do yourself a favour. Do not use Grails or Rails. Use a statically typed language and decent framework - they may not reek of cutting edge cool but they're mature, have good tools and encourage maintainable code.

12
2
Anonymous Coward

Re: Not surprised

Ruby isn't a bad language at all, but it is only very recently that decent implementations have surfaced, such as JRuby, which gives you the concurrency and JIT capabilities of the JVM. The problem is (to my mind, anyway) the very easy to use and powerful metaprogramming features. They're very heavily used in Rails, and whilst they make development super simple they seriously inhibit performance. Ruby off the Rails is a much more tractable beast, but a lot less popular.

Still, moving to Scala, eh? The Yammer guys built their stuff on Scala, their product effectively being "Twitter That Actually Had A Business Plan". Caused them no end of problems, and they migrated everything to Java. Be interesting to see if Twitter does any better in that regard.

0
0
Silver badge
Holmes

Re: Not surprised

> Use a statically typed language and decent framework

Oh I agree with that. It's just that rather often I barf in anger as I have to write the exact same code for different types or must enter the infinite verbosity torture chamber of boilerplate code (which introduces bugs in and of itself) that you don't need with Groovy.

Indeed, look at it like this: things that are errors in the statically typed language no longer are. It's like moving from a language without class templates to one with class templates. It really helps. Generally.

1
0
Silver badge

Re: Not surprised

Do yourself a favour. Do not use Grails or Rails. Use a statically typed language and decent framework

I use Grails, and I hate it being based on Groovy. When I started with Grails, I did everything in Groovy. Now, I do as little as possible in Groovy, with everything else being pure Java. I just got burned too many times.

If they could re-write Grails so that it was pure Java, I'd be a very happy chap.

0
0

Re: Not surprised

There is a bit of misunderstanding going on here. Their backend was not using Rails; it was using Starling (https://github.com/starling/starling). Twitter's front end is still using Ruby on Rails as far as I know. Also, JRuby on Rails runs fine on the JVM. The basic MRI implementation of Ruby is constrained by the way it has been designed. Ruby on Rails is perfect for the problem it solves - Starling, on the other hand, was not up to the task.

0
0
Silver badge

Re: Not surprised

I thought Ruby was not so great as a practical language, and this proves it

No, it does not. It's one data point. Your understanding of what constitutes "proof" is flawed.

For that matter, your understanding of what constitutes "a practical language" looks pretty weak too - "practical" doesn't have much meaning outside a specific context of practice. And let's not even raise the issue of what "scripting style code" might mean.

0
0

@infernoz - you don't know what you're talking about. Are you suggesting everyone should start building their websites in Java on the off chance it goes global?! Java is faster, yes (duh?!) but is much more expensive to develop in due to the longer time required to write Java code. GitHub and Groupon seem to be scaling no problem despite being Rails applications.

In fact, here Twitter admitted Ruby wasn't the issue:

http://highscalability.com/scaling-twitter-making-twitter-10000-percent-faster

Also - back in 2009 *after* they switched to Scala, they still topped out at 300 tweets a second (pretty pathetic really), so it has taken them 3 years to ramp up to nearly 10 000 tweets per second. Clearly there is much more work involved here than simply switching languages.

6
5
Anonymous Coward

@darkpill

As my dear departed parents used to say, "If a job is worth doing, it is worth doing well".

That generally means taking a little longer to produce the best outcome.

Significantly, it appears that in this case it would have been significantly cheaper for Twitter to have used Java in the first place. That way they would only have had to do it once.

7
6
Silver badge

I've worked in a lot of places hobbled by prototypes.

It can be quick and easy to prototype in something quick and easy, but I've worked in plenty of places where one or more mission-critical apps are still basically stuck in the prototype version, and more effort is being spent on maintaining that than would be required to build a full enterprise version.

In fact I know a few extremely large IT companies that still exist because people take that short-termed view at the start and are forever trapped in it.

8
0
Anonymous Coward

Indeed

I know in my particular industry there are quite a few little tools I hacked together in the wee hours, to get past a little lump of development inefficiency, that somehow managed to slip into production.

I'm never sure whether to be flattered or scared to death, when I get a call out of the blue asking if I can advise on how to extend something that was supposed to have been replaced way before testing.

1
0
Anonymous Coward

Also - back in 2009 *after* they switched to Scala, they still topped out at 300 tweets a second.

Yes, because at that point they still hadn't learnt their lesson - new languages may be cool, but they usually suck. Java sucked for years until the JVM was heavily optimised and lessons had been learnt with regard to things like locking and immutability. It only survived because even when it sucked, for most applications it still sucked less than writing in C or Perl. Now we have generics in the Java language along with an excellent collections library and great tooling (apart from Eclipse - try IntelliJ IDEA or NetBeans instead).

4
0
Anonymous Coward

Should have used lisp from the start.

4
0
Terminator

Lisp

Prolog's the daddy.

0
0

Not really a problem for Rails

Plenty of large sites run fine on Rails and would have taken an age to write in Java. The very few sites that end up as big as twitter will always need to performance tune the hell out of the bottlenecks with whatever tools they think best for that VERY SPECIFIC job.

2
0
K
Bronze badge
Facepalm

As a PHP and semi-Java developer, I say..

Ruby.... LOL LOL LOL LOL LOL mwahahahaha die ruby, DIE!

1
9
Anonymous Coward

Re: As a grown up, I say..

Whatever tool is best for a specific job

4
0
Anonymous Coward

Re: As a programmer, I say..

Don't be scared of C or C++ for that matter. Can someone tell me why they used Java (and not why *you think* they used Java).

2
0
Silver badge

Re: As a programmer, I say..

Possibly because it doesn’t do scary like C++ does.

By scary I mean actually provide a complex solution to a complex problem in the minimum lines required if you get it right. The scary bits are often confusing and to the uninitiated pointless but if you make the effort to understand and use them you will realise they are the icing on the cake put there by someone who has made more cakes than your mum could eat in a lifetime and knows how to put icing on that would hold almost any cake together.

It's such a shame that it's mostly unmaintainable, in the sense that most highly qualified newcomers to the party can make lots and lots of iced buns but find making a cake scary, and are kept too busy making buns to find out how to make the cake that will feed the 5000, even though it would only lose a dozen buns in production.

And managers like to gorge on buns. Shame it's not lardy cake with added cholesterol - we might get the time to learn how to make bigger and better cakes while they have a triple bypass.

5
0

Re: As a programmer, I say..

Java has built-in concurrency and a very good memory model (albeit C++ has now caught up). GC in a concurrent environment is a true blessing. Moreover, due to dynamic compilation, Java allows you to deploy/redeploy modules w/o restarts.

Plus, Java is significantly easier to write (and read) than C.

Yet memory management (i.e. GC) is the clear winner, IMO.

5
1
Anonymous Coward

Re: As a programmer, I say..

Don't be scared of C or C++ for that matter. Can someone tell me why they used Java (and not why *you think* they used Java).

I can tell you why they don't use C++. Go and read the "Effective C++" and "Exceptional C++" series of books, then try to tell me with a straight face that a language with so many gotchas is suitable for web application development.

2
0
Thumb Down

Flaming over a particular language?

Please just stop and go do some actual work.

1
1

Re: As a grown up, I say..

Whatever tool is best for the job, absolutely. You come to realise this sooner or later in the industry. If the tool is COBOL or something then so be it, and to hell with doing it in the trendy scripting language du jour just because all the cool kids on the internet said so.

1
0
Trollface

Re: As a grown up, I say..

Yeah the language or platform doesn't matter - I can write C in any language

7
0
Silver badge
FAIL

Re: As a programmer, I say..

"Java has built-in concurrency "

Concurrency is the job of the operating system, NOT the language. Java only has it because it usually runs in a VM, which is in effect a mini OS that pointlessly duplicates functionality in the actual OS.

2
5
Silver badge
Holmes

Re: As a programmer, I say..

> Can someone tell me why they used Java

Because lousy programmers. Primadonnas who think they can handle C/C++ then shit all over themselves are the worst. But even so anything employable at market rates these days is lousy even with Java.

If programming needs to be ghetto style, I would rather they do it in Java.

2
0
Silver badge

Re: As a programmer, I say..

> Java only has it because it usually runs in a VM which is in effect a mini OS

LOL no. Completely different concepts.

0
2

Concurrency!!

Concurrency is the job of the operating system , NOT the language.

BullC**p!! I mean it seriously!

The concurrency has little to do w/ the OS but mostly w/ the hardware. That means primitives like CAS, memory ordering and fencing. I won't go into details here (since it's not an appropriate place); however, the OS may provide the thread scheduling and that's all. Lock/mutex and lock-free data structures (which I write myself) need the aforementioned CAS (or LL/SC) and a memory model; the OS is irrelevant.
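
As a rough illustration of the kind of CAS-only data structure mentioned above, here is a minimal sketch of a lock-free stack (the well-known Treiber stack). The class name `TreiberStack` is made up for this example, and nothing here is taken from any code discussed in the thread. It relies only on `AtomicReference.compareAndSet`, which the JVM maps to the hardware's atomic instruction; no OS call is made on either path.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical example class: a lock-free LIFO stack built on CAS alone.
class TreiberStack<T> {
    private static final class Node<T> {
        final T value;
        final Node<T> next;

        Node(T value, Node<T> next) {
            this.value = value;
            this.next = next;
        }
    }

    // The only shared mutable state: a reference to the top node.
    private final AtomicReference<Node<T>> head = new AtomicReference<>();

    void push(T value) {
        Node<T> oldHead;
        Node<T> newHead;
        do {
            oldHead = head.get();
            newHead = new Node<>(value, oldHead);
            // Retry if another thread changed the head in the meantime.
        } while (!head.compareAndSet(oldHead, newHead));
    }

    // Returns null when the stack is empty.
    T pop() {
        Node<T> oldHead;
        do {
            oldHead = head.get();
            if (oldHead == null) {
                return null;
            }
        } while (!head.compareAndSet(oldHead, oldHead.next));
        return oldHead.value;
    }

    public static void main(String[] args) {
        TreiberStack<Integer> stack = new TreiberStack<>();
        stack.push(1);
        stack.push(2);
        System.out.println(stack.pop()); // last pushed comes out first
        System.out.println(stack.pop());
        System.out.println(stack.pop()); // empty stack yields null
    }
}
```

Because nodes are freshly allocated and garbage collected, this sketch sidesteps the ABA problem that the same design would have to handle in C or C++.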

0
1
Silver badge
FAIL

Re: Concurrency!!

"BullC**p!! I mean it seriously!"

Really? So you think threads are the job of the language, do you? So what about separate processes then? No? Why not?

"The concurrency has little to do w/ the OS but mostly w/ the hardware."

Utter utter crap. Go get yourself a clue about how a modern OS works. Yes hardware is required for pre-emptive multitasking but the OS kernel actually does it.

0
0
Silver badge
WTF?

Re: As a programmer, I say..

"LOL no. Completely different concepts."

How wrong you are.

0
0

Re: Concurrency!!

@Boltar,

Obviously you do not grasp the concurrency part - w/o the potential interaction between the different threads the problem would be 'embarrassingly concurrent' (like processing different images) and not really interesting - hence not even considered 'concurrent'.

Now when the threads actually need to communicate the OS is not involved much (aside from the park/unpark stuff), think of queues, message passing, shared maps, any shared state, etc.

Thread scheduling is a really uninteresting part from the application's point of view [which is where the language comes in] - indeed it plays a role, in that a GC-enabled language may starve the CPU and not allow the garbage collector (or compiler) to run in time, yet aside from that there's nothing interesting. Some schedulers might allow faster context switches and so on, but really that's nothing the application can do anything about.

A simple example, sorting an array: it can be implemented via fork-join and merge sort, and the OS has nothing to do besides managing/scheduling the threads - the real concurrency part comes from the fork/join work-stealing queues, which lie in the application domain. Also, if the number of active threads stays the same as the number of logical cores of the system, the OS is practically not involved.
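
The fork/join merge sort described above could be sketched roughly as follows. This is an illustration under assumptions, not anyone's production code: the class name `ForkJoinMergeSort` and the sequential cut-off `THRESHOLD` are invented here, while `ForkJoinPool`, `RecursiveAction` and `invokeAll` are the standard `java.util.concurrent` classes. The work-stealing queues live inside the pool, in user space.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

// Hypothetical example class: parallel merge sort on the fork/join framework.
class ForkJoinMergeSort {
    // Below this size a slice is sorted sequentially; the value is an arbitrary assumption.
    private static final int THRESHOLD = 1 << 12;

    private static final class SortTask extends RecursiveAction {
        private final int[] data;
        private final int[] scratch;
        private final int lo;
        private final int hi; // exclusive

        SortTask(int[] data, int[] scratch, int lo, int hi) {
            this.data = data;
            this.scratch = scratch;
            this.lo = lo;
            this.hi = hi;
        }

        @Override
        protected void compute() {
            if (hi - lo <= THRESHOLD) {
                Arrays.sort(data, lo, hi);
                return;
            }
            int mid = (lo + hi) >>> 1;
            // Fork both halves; idle workers may steal them from this worker's queue.
            invokeAll(new SortTask(data, scratch, lo, mid),
                      new SortTask(data, scratch, mid, hi));
            merge(mid);
        }

        // Merge the two sorted halves [lo, mid) and [mid, hi) back into data.
        private void merge(int mid) {
            System.arraycopy(data, lo, scratch, lo, hi - lo);
            int i = lo;
            int j = mid;
            int k = lo;
            while (i < mid && j < hi) {
                data[k++] = scratch[i] <= scratch[j] ? scratch[i++] : scratch[j++];
            }
            while (i < mid) {
                data[k++] = scratch[i++];
            }
            while (j < hi) {
                data[k++] = scratch[j++];
            }
        }
    }

    static void sort(int[] data) {
        ForkJoinPool.commonPool().invoke(new SortTask(data, new int[data.length], 0, data.length));
    }

    public static void main(String[] args) {
        int[] data = new Random(42).ints(100_000, 0, 1_000_000).toArray();
        int[] expected = data.clone();
        Arrays.sort(expected);
        sort(data);
        System.out.println(Arrays.equals(data, expected));
    }
}
```

The common pool sizes itself from the number of available processors by default, which matches the "as many active threads as logical cores" case mentioned above.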

I believe I am well educated how OS works, there is no need to be offensive.

0
0
Silver badge
Facepalm

Re: Concurrency!!

"Obviously you do not grasp the concurrency part "

I grasp it perfectly well thank you and your attempt to blind me with technobabble failed miserably.

"think of queues, message passing, shared maps, any shared state,"

Who do you think looks after all those? It's either the OS or the VM doing OS-like operations. Clue - queues , semaphores and process stacks are all controlled by the OS.

"Also if the active threads stays at the same number as the logical cores of the system the OS practically is not involved."

Oh rubbish. Do you think those threads have those cores to themselves 100% of the time? What do you think all the other applications on the system are doing meanwhile? The OS is constantly swapping threads and processes in and out of run state.

"I believe I am well educated how OS works,"

Yeah, right. It seems to me you know how a Java VM works and that's about it.

0
0

Re: Concurrency!!

Clue - queues , semaphores and process stacks are all controlled by the OS.

And this is where you fail -- none of them requires OS intervention, all you need is park/unpark from the OS, unless you wish to do busy waits (which are actually used to achieve low-latency hand-offs).

How you mistake concurrency w/ multitasking/thread scheduling - that was what the 'grasp' part was about. Thread scheduling (incl. processes) is the OS's job; however, that is a given regardless of what language you use. Lock/mutex/semaphores/queues (incl lock-free) are application domain; you do not need anything from the OS to implement them - park/unpark for threads is all it takes. All OSes have it.

0
0
Silver badge
Facepalm

Re: Concurrency!!

"And this is where you fail -- none of them requires OS intervention,"

Really? Do explain how multi-process queues and semaphores work then. How exactly do processes bypass the OS kernel and communicate across separate virtual address spaces? As for multithreading, which uses the same address space, feel free to explain how atomic-operation mutexes work without OS intervention, given that threads need to be woken up or put to sleep based on their values.

"How you mistake concurrency w/ multitasking/thread scheduling"

No. One is simply the implementation of the other. You can't talk about concurrency without mentioning threading or processes.

"Lock/mutex/semaphores/queues (incl lock-free) are application domain,"

Mutexes can't, because you can end up with race conditions if you have more than 2 threads waiting on the same mutex, unless you go with a round-robin setup, which is highly inefficient. You CAN implement the others in the application domain, but usually the OS kernel supplies these facilities because it can put threads to sleep and wake them up. The Java VM does it internally because it rolls its own "threads" and, as I said in an original post, it simulates a mini VM.

Btw, you do apparently love buzzwords but you might try acquiring some knowledge first. It tends to help in these sorts of arguments.

0
0

Re: Concurrency!!

The communication is within the process, i.e. threads do share the same address space. I never mentioned interprocess communication - however, again, it can be implemented w/o the OS (!!) through shared memory; once the memory is allocated/shared the OS doesn't kick in. The approach, again, is used for low-latency process communication; it requires polling, and one core per polling process, but it's a lot faster than a socket on 127.0.0.1 (also a lot more cumbersome to use).

Mutex/Lock is trivial to implement w/o any races given CAS or alike (which is a CPU instruction[s] and doesn't have anything to do w/ the underlying OS). You do not need any round robin, just queues of waiters and the mentioned park/unpark. See "Craig, Landin, Hagersten or CLH lock queue" if you need more info; the queue is not required to be used with spin-locks, either. No kernel-privileged CPU (or memory) instructions are necessary.
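
To make the "CAS plus a queue of waiters plus park/unpark" idea concrete, here is a minimal sketch. It is not the CLH queue lock named above; it is a simpler first-in-first-out mutex that closely follows the example given in the `LockSupport` class documentation. The class name `ParkingMutex` and the `runDemo` helper are invented for this illustration. Note that parking and unparking a thread still go through the OS scheduler, which is the one OS service the post concedes is needed.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.LockSupport;

// Hypothetical example class: a non-reentrant FIFO mutex built from CAS and park/unpark.
class ParkingMutex {
    private final AtomicBoolean locked = new AtomicBoolean(false);
    private final Queue<Thread> waiters = new ConcurrentLinkedQueue<>();

    void lock() {
        boolean wasInterrupted = false;
        Thread current = Thread.currentThread();
        waiters.add(current);
        // Wait until this thread is at the head of the queue AND wins the CAS.
        while (waiters.peek() != current || !locked.compareAndSet(false, true)) {
            LockSupport.park(this);
            if (Thread.interrupted()) {
                wasInterrupted = true; // remember the interrupt, keep waiting
            }
        }
        waiters.remove();
        if (wasInterrupted) {
            current.interrupt(); // restore interrupt status on the way out
        }
    }

    void unlock() {
        locked.set(false);
        LockSupport.unpark(waiters.peek()); // unpark(null) is a no-op
    }

    // Demo helper: several threads increment a plain int under the mutex.
    static int runDemo(int threads, int incrementsPerThread) {
        ParkingMutex mutex = new ParkingMutex();
        int[] counter = {0};
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    mutex.lock();
                    try {
                        counter[0]++;
                    } finally {
                        mutex.unlock();
                    }
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter[0];
    }

    public static void main(String[] args) {
        // With mutual exclusion intact, no increment is lost.
        System.out.println(runDemo(4, 10_000));
    }
}
```

In real code `java.util.concurrent.locks.ReentrantLock` would be the sensible choice; the sketch only shows that the ingredients are a CAS, a waiter queue and park/unpark.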

The Java VM does it internally because it rolls its own "threads" and, as I said in an original post, it simulates a mini VM.

Barring green threads for Solaris 13 years ago, Java has always been using native threads. I mean always. Saying otherwise is pure ignorance. Contrary to popular belief, the Java VM is not any mini-VM that emulates an OS; it maps most of its needed native functionality straight to the OS. I can say I know the HotSpot Java source/impl. well.

Btw, you do apparently love buzzwords but you might try acquiring some knowledge first. It tends to help in these sorts of arguments.

This is definitely the most inaccurate description of me ever. I am exactly the opposite of buzz whoring. Just to help you out I'd say I do concurrent code for living.

My last post here - answering baseless flame ain't fun especially when the flames contain misinformation and lack of basic understandings.

0
0
Bronze badge
Boffin

Re: Concurrency!!

Concurrency should be mainly handled by the OS, true; however the language compiler/VM needs to know about concurrency and all the niggling details like threads and CAS to work properly without needless locking or race conditions.

Java has explicit support for non-locking in the java.util.concurrent classes so that you rarely even need to know about the java Thread class which wraps around OS threads. I have migrated a lot of horrible Thread and synchronized code to use the java.util.concurrent classes so that the JVM can handle this nastiness rather than me.
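
A small sketch of what such a migration can look like, assuming a hypothetical hit counter that used to be a `HashMap` guarded by `synchronized` and fed by hand-started threads. The class name `HitCounter` and its methods are invented for this example; `ConcurrentHashMap`, `LongAdder` and `ExecutorService` are standard `java.util.concurrent` classes.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical example class: per-key counters without explicit Thread or synchronized code.
class HitCounter {
    private final ConcurrentHashMap<String, LongAdder> hits = new ConcurrentHashMap<>();

    void record(String key) {
        // The map creates the adder at most once per key; increments are contention-friendly.
        hits.computeIfAbsent(key, k -> new LongAdder()).increment();
    }

    long count(String key) {
        LongAdder adder = hits.get(key);
        return adder == null ? 0L : adder.sum();
    }

    // Each worker records every key `repeats` times; the pool owns the threads.
    static HitCounter countInParallel(List<String> keys, int workers, int repeats) {
        HitCounter counter = new HitCounter();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            for (int w = 0; w < workers; w++) {
                pool.execute(() -> {
                    for (int r = 0; r < repeats; r++) {
                        for (String key : keys) {
                            counter.record(key);
                        }
                    }
                });
            }
        } finally {
            pool.shutdown();
        }
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter;
    }

    public static void main(String[] args) {
        HitCounter counter = countInParallel(Arrays.asList("ruby", "java"), 4, 1_000);
        System.out.println(counter.count("ruby") + " " + counter.count("java"));
    }
}
```

The library classes decide internally where a lock or a CAS is needed, which is the "let the JVM handle the nastiness" point made above.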

Writing and debugging concurrent code is hard, because you need to let stuff happen asynchronously without contention as much as possible, and only add locking where this can't be avoided.

I'm on the http://www.javaperformancetuning.com/ mailing list so that I keep up to date with what works and what causes pain at various scales, including for heavy duty applications.

0
0
Anonymous Coward

Meh

15K messages in a second? Even if each message were the maximum length, by my reckoning that's only about 16Mbit/s. Why is this impressive? My Raspberry Pi could do that.
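
The back-of-envelope figure above can be checked with a few lines. This sketch assumes 140 bytes per message (one byte per character, no metadata or protocol overhead), which is the same simplification the post makes; the class name `TweetBandwidth` is invented for the example.

```java
import java.util.Locale;

// Hypothetical example class: raw ingest bandwidth for a given message rate.
class TweetBandwidth {
    static long bytesPerSecond(long messagesPerSecond, int bytesPerMessage) {
        return messagesPerSecond * bytesPerMessage;
    }

    static double megabitsPerSecond(long bytesPerSecond) {
        return bytesPerSecond * 8 / 1_000_000.0;
    }

    public static void main(String[] args) {
        long bytes = bytesPerSecond(15_000, 140); // 15,000 x 140 = 2,100,000 bytes/s
        System.out.println(String.format(Locale.ROOT, "%d bytes/s, %.1f Mbit/s",
                bytes, megabitsPerSecond(bytes)));
    }
}
```

That works out to roughly 2.1 MB/s or 16.8 Mbit/s of raw message text, in line with the "about 16Mbit/s" estimate; it says nothing about the cost of storing or redistributing those messages.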

0
2
Silver badge

Re: Meh

Please get in contact then, oh hallowed AC one. Fame and fortune await.

1
0

Re: Meh

During higher market activity we get like 50k messages per second and that's every day and that's per server. Horribly nothing impressive, it's an underwhelming number.

1
1
Anonymous Coward

Re: Meh

During higher market activity we get like 50k messages per second and that's every day and that's per server. Horribly nothing impressive, it's an underwhelming number.

I'm guessing your weedy little trading application doesn't have to redistribute those messages to millions of readers, nor does it have many (if any) distributed caches and cross references to maintain. Your market activity probably consists of a handful of attributes (stock ticker, price, quantity) that are persisted to one database with a few potential triggered messages, all handled over a LAN or high speed WAN. Big deal.

1
1

Re: Meh

Your Raspberry Pi isn't highly available though, is it. Writes to a reliable distributed database unfortunately take a lot longer to complete because you have to write to the quorum of databases rather than just one to be sure that the write has succeeded.
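
A rough sketch of why a quorum write is slower than a single write: the write can only be acknowledged once a majority of replicas have responded, so its latency is that of the slowest replica within the quorum rather than the fastest replica overall. The class name `QuorumWrite` and the latency figures in `main` are hypothetical, chosen purely for illustration.

```java
import java.util.Arrays;

// Hypothetical example class: acknowledgement latency of a quorum write.
class QuorumWrite {
    // Smallest number of replicas that forms a strict majority.
    static int majority(int replicas) {
        return replicas / 2 + 1;
    }

    // The write is acknowledged when the quorum-th fastest replica has responded.
    static long ackLatencyMillis(long[] replicaLatenciesMillis, int quorum) {
        if (quorum < 1 || quorum > replicaLatenciesMillis.length) {
            throw new IllegalArgumentException("quorum must be between 1 and the replica count");
        }
        long[] sorted = replicaLatenciesMillis.clone();
        Arrays.sort(sorted);
        return sorted[quorum - 1];
    }

    public static void main(String[] args) {
        long[] latencies = {2, 40, 5}; // made-up per-replica write latencies in ms
        System.out.println("single fastest replica: " + ackLatencyMillis(latencies, 1) + " ms");
        System.out.println("majority quorum:        "
                + ackLatencyMillis(latencies, majority(latencies.length)) + " ms");
    }
}
```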

Also I am sure that twitter's read requirements are several orders of magnitude worse than the write requirements.

0
1

Re: Meh

application doesn't have to redistribute those messages to millions of readers,

We do exactly that.

2
0
Silver badge

Re: Meh

We can't all work for Bloomberg stanimir :)

fanatical Ruby developers who believe the language's syntax, its high developer productivity, and its overall philosophy far outweigh any performance disadvantage it might have compared to other languages.

Well, they built it in Ruby, and everyone jumped on board it. You are positing that had they built it in Java from the start, they would have built it just as fast, and people would have jumped on board it just as quick.

Ruby is no good for prime time anyway really, it allows too much hackery and punctuation - check out some of why_'s DSLs written in Ruby. Write in Python; if it is too slow, rewrite in C. C ain't hard.

0
1
Anonymous Coward

Re: Meh

We can't all work for Bloomberg stanimir :)

Well if he does work there, then he's coding in C using the glib library - at least that's what they used last time I heard about the details of their high volume systems.

0
0
Silver badge

Next time we get one of those really predictable Java-bashing threads, I'm going to point people here. </smug_mode>

2
0
Anonymous Coward

@Greg J Preece

Why, so they can laugh at college grads?

0
0
Bronze badge

I'm confused why they have such problems with performance. Facebook has issues because people are posting massive quantities of data and interlinking it in complex ways.

Twitter is 140 bytes per tweet maximum with no pictures and each tweet can only be retweeted so not overly complex. As AC said above, 15k messages is only 2MB of data a second, or 16Mb bandwidth. Although in theory they may need 15k IOPS to store those messages, that's hardly a lot for a global solution, and using SSD with SATA for longer term (old tweets are rarely read back) wouldn't cost much at all.

0
0
Thumb Up

Listen to @Lusty, the man speaks the truth; 15k messages/second might have been an impressive number in the mid-90s but not now.

1
0

it's not that easy

When people tweet, that tweet goes out to each person who follows them. Then there are also the hashtags. Think about that for a minute.
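
To put a number on that fan-out: if every tweet has to be delivered to each follower, the delivery rate is the tweet rate multiplied by the follower count. The sketch below uses the 15,000 tweets per second mentioned earlier in the thread and a purely hypothetical average of 200 followers per author; the real figure is not given anywhere in this discussion. The class name `FanOut` is also invented.

```java
// Hypothetical example class: timeline deliveries caused by follower fan-out.
class FanOut {
    // One delivery per follower for every tweet written.
    static long deliveriesPerSecond(long tweetsPerSecond, long averageFollowers) {
        return Math.multiplyExact(tweetsPerSecond, averageFollowers);
    }

    public static void main(String[] args) {
        long tweetsPerSecond = 15_000;
        long averageFollowers = 200; // assumption for illustration only
        System.out.println(deliveriesPerSecond(tweetsPerSecond, averageFollowers)
                + " timeline deliveries per second");
    }
}
```

Under that assumption 15,000 writes per second become 3,000,000 deliveries per second, before hashtags or search indexing are counted.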

0
0
