Dick Jones would like this
AI trying to detect what is, and isn't, a gun?
The first thing that comes to mind is ED-209...
"You have 20 seconds to comply."
Artificial intelligence has the potential to take over mundane, boring tasks such as driving, scheduling meetings and transcribing speech. Now there's another job that can be added to the list: detecting handguns in videos. As the technology improves, it won't be long before police officers or security guards can jump straight …
Because these cops couldn't:
https://www.engadget.com/2009/03/04/man-holds-woman-hostage-for-10-hours-with-a-sega-light-gun/
Well as a Yank...
1) Cleveland, Ohio. There was a 13-year-old who was big for his age and looked more like 17. He was also mentally challenged. He was sitting on a swing with an airsoft pistol whose orange plastic barrel tip had broken off. The gun looked real.
Someone called in to the police and reported him. The person told the 911 operator that the gun was probably fake.
The police rolled up on him and the kid reached for the gun. The younger officer shot and killed him.
2) In Chicago, we have gang bangers who are as young as 13 or 14 years old carrying real pistols.
3) We also have kids carrying airsoft guns who show them to people on the street in an attempt to rob them. (While IL passed a CCW law that took effect Jan 1, 2015, most people do not own guns, and fewer still have CCW permits, although that number is on the rise.) So, how do you tell a real gun from a fake gun that was made to look real?
Every morning, I open up the Trib site to see how many were shot and killed overnight. (Yes, it's that bad in some neighborhoods.) The issue isn't guns but the gang violence over turf. Due to the recent lawsuits and BLM protests, the police are less likely to be aggressive because they want to keep their jobs and don't want to face a civil suit. The gang bangers are more afraid of their higher-ups than of the police, and there's even video of them taunting the police who were working a crime scene. (Including firing off a gun in an alley a block away.)
But back to your point... you can make a homemade gun that fires a .410 shotgun shell using stuff you can buy at a hardware store. (Google zip gun) And it won't look like a gun.
As for police hostage situations: police have killed attackers who had small knives and were high on drugs (L. McDonald [Chicago]) or who had a baseball bat and were walking towards the police.
The point is that when you have a hostage situation, even if he's not armed but is in a position to cause harm to the hostage, you will have the police in a situation where they may need to use deadly force. Most times, they'll wait it out.
is definitely pointing a gun at the camera.
Good to know.
Let's say I'm doubtful any of the others will do much better. BTW, 144 million parameters at 64 bits each is a bit over 1GB of data to (potentially) update per frame.
Now what is the actual quality of the real CCTV these things are meant to be looking through?
There's a scene in one of the Bond books (Man with the Golden Gun?) where he's going through an automated firing range to test his skills, and Fleming comments that the range is normally lit to be "Averagely bad", as IRL things are rarely lit well enough to identify who's holding what.
Half that gig. The "parameters" would probably be 32-bit floats, but there are ways to cut that down, too.
The "parameters" are generally multipliers stashed in a GPU's memory and divvied up in parallel to however many computation units the GPU has (in the 1 to 3 thousand range nowadays for a single, good GPU card).
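For anyone who wants to check the numbers, here is a quick back-of-the-envelope sketch. The only figure taken from the thread is the 144 million parameters; the 16-bit row is just my own illustration of one way to "cut that down", not something the article says the researchers did.

```python
# Rough memory footprint of a model's weights at different precisions.
# PARAMS comes from the thread; the bit widths are standard float sizes.
PARAMS = 144_000_000


def weights_bytes(n_params: int, bits_per_param: int) -> int:
    """Bytes needed to store n_params values of the given bit width."""
    return n_params * bits_per_param // 8


for bits in (64, 32, 16):
    size = weights_bytes(PARAMS, bits)
    # Show both decimal gigabytes and binary gibibytes.
    print(f"{bits}-bit: {size / 1e9:.3f} GB ({size / 2**30:.3f} GiB)")
```

So 64-bit parameters come to 1.152 GB, and 32-bit floats halve that to 576 MB.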
Probably. I just did it as a quick BOTE with 144 being a bit more than 1/8 of 1024 and worst case assumptions. It's not that big a number and there's that new short reals architecture that British company (XMOS?) is working on to cut it down further.
These sorts of problems are usually described as matrices but I wonder what the update pattern is like across 16 layers.
My instinct is that not every weight gets updated on every pass, so just cycling through every element in the matrix would be very wasteful, as a lot of the time it would be A[X] = A[X] x 1. Smarter tracking of which cells really need to be updated could pay big dividends, although I'd be surprised if they haven't already thought of this. The joker would be if the update pattern shifts too frequently to make optimizing it worthwhile over just cycling through all cells.
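A toy sketch of that idea, with the caveat that it is only an illustration: the function names and the tiny weight list are made up, and it says nothing about how the researchers' network is actually trained. Also worth noting that weights only change during training; when the trained network is run on video frames, the weights are read, not updated.

```python
# Compare visiting every weight with visiting only the cells that change.
# All names here are hypothetical; this is plain gradient descent on a list.

def dense_update(weights, grads, lr):
    """Visit every cell, even where the gradient is zero (a no-op)."""
    for i, g in enumerate(grads):
        weights[i] -= lr * g
    return len(grads)  # number of cells visited


def sparse_update(weights, nonzero_grads, lr):
    """Visit only the tracked cells: nonzero_grads maps index -> gradient."""
    for i, g in nonzero_grads.items():
        weights[i] -= lr * g
    return len(nonzero_grads)  # number of cells visited


weights_a = [1.0] * 8
weights_b = [1.0] * 8
grads = [0.0, 0.5, 0.0, 0.0, -0.25, 0.0, 0.0, 0.0]
# The "smarter tracking": keep just the entries that would change a weight.
nonzero = {i: g for i, g in enumerate(grads) if g != 0.0}

visited_dense = dense_update(weights_a, grads, lr=0.1)
visited_sparse = sparse_update(weights_b, nonzero, lr=0.1)
print(visited_dense, visited_sparse, weights_a == weights_b)
```

Both versions end with the same weights, but the sparse one touches 2 cells instead of 8. Whether that pays off in practice depends on how many gradients are really zero and on the cost of maintaining the index, which is the "joker" mentioned above.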
Actually less.
The trick is to ignore the background and to focus on the people and their immediate surrounding.
This reduces the amount of image that you have to process and then you can isolate on the hand.
So first frame, you have the most work to find the people, search for hands and then once isolated, you can track the hands to see if they are holding something in the subsequent frames.
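To put a rough number on the saving, here is a minimal sketch. Everything in it is assumed for illustration: the 640x480 frame size, the two 64x64 hand boxes, and the helper names. The person and hand detection itself is not shown; the boxes are simply given.

```python
# How much of a frame is left once you crop to the hand regions.
# A box is (left, top, right, bottom), with right and bottom exclusive.

def crop(frame, box):
    """Return the sub-image of frame (a list of pixel rows) inside box."""
    left, top, right, bottom = box
    return [row[left:right] for row in frame[top:bottom]]


def pixel_fraction(frame_w, frame_h, boxes):
    """Fraction of the frame covered by the boxes (assumes no overlap)."""
    roi_pixels = sum((r - l) * (b - t) for l, t, r, b in boxes)
    return roi_pixels / (frame_w * frame_h)


FRAME_W, FRAME_H = 640, 480                # assumed CCTV resolution
hand_boxes = [(100, 200, 164, 264),        # hypothetical hand regions
              (400, 220, 464, 284)]

frame = [[0] * FRAME_W for _ in range(FRAME_H)]   # blank stand-in frame
patches = [crop(frame, box) for box in hand_boxes]
fraction = pixel_fraction(FRAME_W, FRAME_H, hand_boxes)
print(f"{len(patches)} patches, {fraction:.1%} of the frame")
```

With these made-up numbers the classifier only has to look at under 3% of the pixels once the hands have been found, which is why the first frame is the expensive one.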
To your point, yes, in a video or movie, the lighting is usually done to make the gun stand out.
But suppose we took a video of a group of men standing and wearing dark clothing. Then one pulls out a gun, while another pulls out a cell phone, both black and the phone's screen is dark. They just pull it out and have it at the hip. Do this again with the people wearing dark gloves. Then again in dimmer light.
You'd be surprised at the results. (Maybe not.) I guess the resolution would also matter.
BTW, on the James Bond Reference.... would the system recognize the gun that was made out of a cigarette holder, and the cigarette case and lighter? (What if he was palming the gun with only the cigarette holder showing? Would a pencil held like that also trigger the system?)
And of course, the gun Daniel Craig is holding is most likely a Sig Sauer P226 but could be a P220. (You can make out the de-cocking lever and it's a full-sized frame.) It also looks to be an older model, since it's missing the front rail. [Yes, I own and shoot Sigs. ;-) ]
Is it rAIcist? Or does this reflect a bias in the training routine?
Do not think so - the guns in the background are not in focus and fairly low res. There is no way to get a high confidence rating on them using an image recognition algo. We know and understand it is a gun based on character behavior. If you take the frame from that movie (easy to do - all of us have a copy) at DVD SD standard res there are not enough pixels to work with.
Now the higher confidence rating on the police squad photo does look fishy. It would be interesting to know what data they fed this.
No silly, they don't need a computer to detect people with sticks. They have meat sacks who can do that already.
I'm American. You're all just foreigners to us, you're lucky if we distinguish between the tea-drinking-foreigners-with-funny-accents and the talk-like-mexicans-but-don't-make-tacos foreigners, especially since neither type is allowed to own guns. Probably because you're all commies.
(I jest, of course - I read the article too quickly. Except for the gun part, it's a shame you don't get those, they're wicked fun.)
So if it thinks it sees 10 guns in 24 hours of video but only one is real, it only wastes about a minute or two of your time showing you sections of the video where there aren't guns.
It might have a problem in those states where open carry is allowed - it will see guns all the time, because idiots who think it is the old west will walk around with a holster on their hip, just because.
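Putting rough numbers on the false-alarm cost: the 10 flags and 1 real gun are from the comment above, but the clip lengths are my own guesses at how long a reviewer spends on each flagged section.

```python
# Reviewer time wasted on false alarms, under assumed clip lengths.

def wasted_review_seconds(flagged: int, real: int, clip_seconds: float) -> float:
    """Seconds spent watching flagged clips that turn out to have no gun."""
    return (flagged - real) * clip_seconds


FLAGGED, REAL = 10, 1            # figures from the comment
precision = REAL / FLAGGED       # share of flags that are real guns

for clip_seconds in (5, 10, 15):  # assumed clip lengths, in seconds
    wasted = wasted_review_seconds(FLAGGED, REAL, clip_seconds)
    print(f"{clip_seconds}s clips: {wasted / 60:.2f} min wasted per 24 hours")
```

At 10 seconds per clip, nine false alarms cost a minute and a half, which fits the "minute or two" estimate. In an open-carry state the flag count climbs, and the wasted time scales with it.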
Artificial intelligence has the potential to take over mundane, boring tasks such as ...scheduling meetings and transcribing speech
I'll buy that for a dollar - I can send it to my company meetings while I get on with some useful work. FFS.
And as regards cameras, I can heartily recommend the Zenit Photosniper. No, really officer, it's a camera!
http://camerapedia.wikia.com/wiki/Zenit_Photosniper