Monday, 7 August 2017

Some thoughts about gender equality

After reading “Google’s Ideological Echo Chamber” I wasn’t expecting any sort of reaction from anybody. I thought, well, just another spoiled kid trying to get some attention, but it seems that at 32 years old I still know too little about how other people think. Some people decided to take positions around this absurd document, which is based on weak suppositions from someone who, I suppose, has had almost no contact with women in his entire life, and who, in a herculean effort to comprehend these mysterious beings, made some assumptions based on what he learnt from movies, music and so on…

As the younger brother of three sisters and the son of the most amazing person I have ever met, I think I’m in a position to put some things in perspective for those whose mother was the only woman they ever got to know, and not too well from what I read.

My mother was born after the Spanish civil war in “San Roman el Antiguo”, a tiny village in the middle of nowhere that became extinct decades ago; my grandfathers were farmers. When my mother was a kid, I’m not sure how old she was, she lost her father, but even under a fascist dictatorship and against my grandmother’s wishes, she pursued a degree in Chemistry. After graduating with honours, and not having enough with this degree, she went on to pursue a PhD.

When she was twenty-something she was diagnosed with ovarian cancer. According to the doctors she was going to be unable to have kids, but it seems that my three sisters and I didn’t agree with that…

While she was raising four kids with almost no help from my father and working at the same time, she decided that it might be a good idea to get another degree, this time in pharmacy. Again, she got the degree with exceptional grades.

She bought an Apple II as soon as it arrived in Spain, learnt to code, and was the one who taught me when I was five, me and my sisters. I still have “El Grosero” (“The Rude One” in English), the first “game” I coded together with my sisters. It was about fart jokes and stuff like that, best game ever :)

I really think that she is the most extraordinary person I have ever met, but she didn’t keep her curiosity and will to learn to herself. As I mentioned before, I have three sisters. I won’t bother you with our stories, but just to put some things in context:

My older sister is a doctor with two specialities; she holds a PhD and specialises in hormonal disorders. She speaks six languages and has lived all around the world, working for the most prestigious hospitals and research centres in every country (she also plays the piano, writes books and so on…)
Another of my sisters holds two degrees, one in architecture and another in technical architecture. She started a company from scratch in the middle of one of the worst crises in the recent history of Spain. Nowadays her company is a very successful one that employs several people
My younger sister studied arts and holds a master’s degree in fashion design. After coming back to Spain from some years living abroad, she started an online company that imports articles from Asia. It is one of the most successful online stores of its type; she is making a lot more than me while working from home and having time for her daughter

I hold a master’s degree in computer science that I finished thanks to my mother. I have been working for several companies in recent years, living in many different countries and meeting awesome people. Thanks to my mother I am who I am right now; she influenced the lives of all the people who had the luck to meet her.

As you may understand, I don’t believe in gender. I don’t like the “women who…” programs that are promoted by Google or any other company; I find it sad that these programs are still necessary. It would be great if people stopped trying to classify each other by what is obvious, but some people are just not smart enough to understand complex concepts, so they have to rely on race, colour, gender and things like that.

If you still believe that gender can determine what you can or can’t do in this life, just stop and think about what you would be doing if you had been born almost seventy years ago in the middle of nowhere with almost no resources, as my mother was…

Note: Sorry about the broken English, but I’m kinda drunk and I’m not going to ask my older sister to correct this :(

Wednesday, 5 April 2017

Some weird behaviour in the Google endpoints - Possible DoS by application-layer flooding?

More than a month ago I reported to Google a weird behaviour that I detected while uploading a picture to Google Docs. I was working on a script to upload pictures and attach them to documents when I made a mistake sending the document ID, and the API kept the request waiting for about 3 minutes, only to end up sending back an internal server error after a weird four-way handshake.

This behaviour made me think that Google Docs was trying to find the document and, since it was unable to find it among any sort of recently used documents, the request was forwarded to another subsystem for long-term storage. Another possibility was that, since the document was not in the DB index, it was performing a full scan.
After trying some other requests I found two other endpoints with similar behaviour, where it is easier to perform the possible DoS attack since they don't require uploading pictures or sending multiple requests. These endpoints keep the client waiting for 4 minutes instead of 3.

The endpoints are:

https://docs.google.com/document/d/invalid_docu_id/edit

https://android.clients.google.com/auth : For this one you have to perform a valid POST request but using an invalid user e-mail

And this is what happens when you use ApacheBench (ab) to test the docs.google.com URL:

As you can see from the ab results, in this case all 10 requests take 4 minutes.

Other times some requests are resolved in milliseconds. This was a bit disconcerting to me, and made me think that it could be caused by some sort of rate limiter. To verify that this was not the case, I ran the same request from a server that had never executed a request like this, with the same result: 4 minutes, as before. After this, I think the most plausible explanation is that there is a crash report system, or some other sort of log collector, trying to get information from the document, which ends up timing out.

This is what we can see using Wireshark:

The transmission above goes as follows:

The server starts answering but it gets stuck
The client sends a TCP Window Update since the transmission is not complete
The client sends a TCP Keep-Alive every minute and the server answers these keep-alive packets; this means that the server is actually listening and has the socket open
The server sends the FIN, ACK in order to start the four-way handshake, I guess after an internal timeout
The client answers with an ACK as expected during the handshake
The client answers with an “Encrypted Alert”; this is the client requesting the termination of the secured TCP connection, because the payload was not completely sent
The client sends the FIN, ACK as expected
Instead of the ACK, the server sends a RST, dropping the connection without shutting it down gracefully

This kind of behaviour facilitates (a lot) a DoS by application-layer flooding. This means that, if you start performing requests to these endpoints, you could end up causing the collapse of the system.

The collapse could be produced by:

Reaching the limit of max open sockets. Since the servers keep answering the keep-alives but don't verify that the client still has the socket open, you could just send requests without waiting for any answer, causing the servers to start rejecting incoming connections; this would also consume other resources like memory, CPU and so on. You have 4 minutes to send as many requests as you can.
If this request causes memory, I/O or CPU pressure in any of the shards, you can just send several requests using different IDs in order to flood all the shards; this would make the systems collapse, causing the DoS

Since you have 4 minutes to cause one of the situations described above, a possible DoS attack becomes much easier and could be launched from a single computer on a commodity network. The attack could also be performed by mistake.

Google doesn't consider this a security issue:

I understand that a generic DDoS attack shouldn't be considered a security issue, since there is very little you can do to prevent a botnet from attacking your systems. But in this specific case it is like leaving your front door completely open because you can't do anything to prevent people from breaking in.

From my point of view this is a security issue. I answered with the e-mail below, explaining the reasons why I think so:

They answered with the following e-mail:

But after some weeks I received a generic e-mail dismissing this as a security issue. The issue is still reproducible.

Friday, 17 March 2017

Some Gmail endpoints are storing e-mails in the browser cache. Google doesn't consider this a security issue

It is well known that if you serve private information, like e-mails, you have to specify the no-store value in the Cache-Control header so that the browser doesn't persist this information in the cache. Otherwise it would be insecure, since after the user closes the session the data would still be there.
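As a small illustration of what "specifying no-store" means in practice, here is a minimal Python sketch that checks whether a Cache-Control header value carries that directive. The function name has_no_store is made up for this example, and the parsing is deliberately simple (comma-separated directives, case-insensitive names).

```python
def has_no_store(cache_control: str) -> bool:
    """Return True if a Cache-Control header value includes the no-store directive."""
    names = set()
    for directive in cache_control.split(","):
        directive = directive.strip().lower()
        if not directive:
            continue
        # Some directives carry an argument (e.g. max-age=0); keep only the name.
        names.add(directive.split("=", 1)[0].strip())
    return "no-store" in names


if __name__ == "__main__":
    print(has_no_store("no-cache, no-store, max-age=0, must-revalidate"))  # True
    print(has_no_store("private, max-age=0"))                              # False
```

A response whose header fails this check may be written to the browser cache and survive after the user logs out.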

According to RFC 2616 (HTTP/1.1), as hosted by the W3C: "The purpose of the no-store directive is to prevent the inadvertent release or retention of sensitive information." ( https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9.2 ) Google is sending this value on most of the responses that contain sensitive data, as you can see from the screenshot below:

The problem is that some old Gmail endpoints are not including this value, like for instance:

https://mail.google.com/mail/u/0/feed/atom

The mentioned endpoint is the one used by the official Gmail Chrome extension:

https://chrome.google.com/webstore/detail/google-mail-checker/mihcahmgecmbnbcchbopgniflfhgnkff?hl=en

So every time this endpoint is accessed by your browser, the result is stored in the cache, leaving all your unread e-mail in a shared space, including a good deal of private information:

I reported this about three weeks ago, sending a detailed description and the following video:

Nobody viewed the video before I got the following answer from Google:

So, since exploiting this security flaw requires access to the computer, Google doesn't care about your security; you must use your own computer :(

In the real world people share computers, and they also use public ones. After a user closes the session, the expected result is that all the user's data is safe as long as the computer used is not compromised. I wasn't talking about installing any sort of malware, keylogger or whatever, just the official Gmail extension.

This is a super easy bug to fix, but the no-store is still not present. All the other e-mail companies are sending the no-store value, for instance WorkMail, FastMail and so on.

The other day, while I was waiting at the airport to take a flight, I went to print my tickets using a shared computer. I had to log in to Gmail in order to download the tickets. I decided to install the official Gmail Chrome extension (perhaps Google is distributing malware).
When I came back I made a video where I was able to access the private e-mail of more than twenty accounts.

I will not publish this video since I prefer not to get in trouble, but I really think that this is an important security issue, easy to fix, that should be addressed as soon as possible.

Saturday, 8 June 2013

Not all the Redis values are strings...

Working with a Redis database, we discovered a very strange behavior after a refactor of one of the services.
Before the refactor we had been storing sets with only integer values in Redis, but we decided to add some information as a suffix of a couple of chars, something like: 12312412 -> 12312412:b. After this refactor, the memory usage of the instances of the cluster increased from ~2GB to ~10GB. This was unexpected for us: according to the Redis documentation, sets only store "binary-safe strings" values: "Redis data types"
And after checking how Redis handles strings (Hacking Strings), the relative memory increase should be directly proportional to the increase in characters of the values. Given the increase in value lengths after the refactor, we expected an increase of about 20%, but memory usage grew roughly fivefold :( .

I decided to do a little proof of concept:

Well, as you can see, we test two kinds of values. The integer sets contain the numbers from 10000 to 10030 with a zero appended, so the results are integers: 100000, 100010, 100020, ..., 100300 . The string sets contain the same numbers but with the zero before the number, which forces Redis to store these values as strings: 010000, 010001, 010002, ..., 010030 .
The expected behavior, if Redis stores the values as strings, is that after creating all the sets the memory usage will be the same for both test cases... but:

This unexpected behavior is caused by the following lines of the Redis source code: t_set.c
Redis checks whether the value can be represented as a "long long", and in that case stores the value as its integer representation. A clever behavior, but I didn't find it documented.
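To make the distinction concrete, here is a rough Python approximation of that check. It is not the Redis code: the function name looks_like_long_long and the exact rules (canonical decimal form, no leading zeros, signed 64-bit range) are assumptions chosen to match the behavior described in this post.

```python
import re

# Signed 64-bit range, i.e. what a C "long long" usually holds.
LLONG_MIN, LLONG_MAX = -(2**63), 2**63 - 1

# Canonical decimal integers only: "0", or an optional minus sign followed
# by digits with no leading zero (assumption, see the note above).
_CANONICAL_INT = re.compile(r"0|-?[1-9][0-9]*")


def looks_like_long_long(value: str) -> bool:
    """Approximate whether a set member could be stored as an integer."""
    if not _CANONICAL_INT.fullmatch(value):
        return False
    return LLONG_MIN <= int(value) <= LLONG_MAX


if __name__ == "__main__":
    for member in ("12312412", "12312412:b", "100000", "010000"):
        print(member, looks_like_long_long(member))
```

Under these assumptions 12312412 and 100000 qualify for the compact integer form, while 12312412:b and 010000 have to be kept as strings, which matches the jump in memory usage seen after the refactor.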

Monday, 11 March 2013

New Spotify Puzzles - Reversed Binary, Zipf's song, Cat vs. Dog - solutions

I discovered some months ago that Spotify has published some new puzzles. I love puzzles, and it was impossible for me to resist the temptation to solve all of them :)

Reversed Binary Numbers

https://www.spotify.com/es/jobs/tech/reversed-binary/

Solution:
- https://github.com/alonsovidales/spotify-puzzles-v2/blob/master/reversed_binary/reversed_binary.py

This is the simplest problem; it took me less than ten minutes to solve it.
The steps to solve it are:

Get the string representation of the number in binary using the bin() Python function
Remove the first two characters, which Python adds to identify the string as a binary representation, using the [2:] slice notation
Reverse the string using [::-1] (-1 is the step)
Convert the reversed string to an integer again using the int() call, passing 2 (the base) as the second parameter
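As a minimal sketch of these four steps (not the code from the linked solution; the function name reverse_binary is just illustrative):

```python
def reverse_binary(n: int) -> int:
    """Reverse the binary digits of a positive integer and return the result."""
    bits = bin(n)               # e.g. '0b1101' for 13
    bits = bits[2:]             # drop the '0b' prefix -> '1101'
    reversed_bits = bits[::-1]  # step of -1 walks the string backwards -> '1011'
    return int(reversed_bits, 2)  # parse the string as a base-2 number


if __name__ == "__main__":
    print(reverse_binary(13))  # 11
    print(reverse_binary(47))  # 61
```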

Zipf's song

http://www.spotify.com/es/jobs/tech/zipfsong/
Solution:
- https://github.com/alonsovidales/spotify-puzzles-v2/blob/master/zipfsong/zipfsong.py

This puzzle was a little more complicated, not because of the problem itself, which only requires applying a simple formula, but because the team who made the puzzles made a little mistake and the platform returned a "Run Time Error" every time I tried to send the answer. I wasted around two hours trying to find the error in my code, and when all hope was lost, one of the people in charge of the puzzles sent me an e-mail:

It was really nice to discover that there was a concerned team behind the puzzles, and that the mistake was not mine :)

The steps to solve this puzzle are:

Calculate the quality of each song applying the formula: number of times played multiplied by the song position
Sort the array of tracks by quality, highest first
Slice the array to get only the required number of song names
Return the list of tracks as a string
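A minimal sketch of these steps, assuming the quality of a song is its play count multiplied by its 1-based position on the album (Zipf's law predicts plays proportional to 1/position, so dividing by the prediction means multiplying by the position). The function name best_songs and the input shape are illustrative, not taken from the linked solution.

```python
def best_songs(tracks, m):
    """Pick the m best songs.

    tracks: list of (play_count, name) tuples in album order.
    Returns the selected names, one per line, best first.
    """
    scored = [
        (plays * position, name)
        for position, (plays, name) in enumerate(tracks, start=1)
    ]
    # sorted() is stable even with reverse=True, so songs with equal
    # quality keep their album order.
    ranked = sorted(scored, key=lambda item: item[0], reverse=True)
    return "\n".join(name for _, name in ranked[:m])


if __name__ == "__main__":
    album = [(30, "one"), (30, "two"), (15, "three"), (25, "four")]
    print(best_songs(album, 2))
```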

Cat vs. Dog

https://www.spotify.com/es/jobs/tech/catvsdog/
Solution:
- https://github.com/alonsovidales/spotify-puzzles-v2/blob/master/catvsdog/catvsdog.py

This is the most complicated of the three problems, but if you remember the old puzzle "Bilateral":
- http://alonso-vidales.blogspot.com.es/2012/03/spotify-bilateral-projects-puzzle.html
it is much simpler :)
Reading the puzzle, it was easy to discover some clues:

Also, based on the universal fact that everyone is either a cat lover (i.e. a dog hater) or a dog lover (i.e. a cat hater), it has been decided that each vote must name exactly one cat and exactly one dog. -> It is a bipartite graph
procedure which guarantees that as many viewers as possible will continue watching the show -> It is a maximum cardinality problem :)

Then we have a bipartite graph:

- http://en.wikipedia.org/wiki/Bipartite_graph

Now we have a reduced number of algorithms to apply, all with low complexity.

We have two options: study the graph of compatible voters or the graph of incompatible voters. If we build the bipartite graph of the incompatible voters, we can obtain the maximum matching on this graph, and from it a minimum set of vertices that covers all the edges.

The minimum number of vertices that cover all the edges of the incompatibility graph is equal to the minimum number of voters to be discarded in order to obtain the maximum number of happy voters, and by König's theorem it is also equal to the size of the maximum matching :)

To obtain this we meet an old friend, one of the most beautiful algorithms, the Hopcroft-Karp algorithm (the same one used to solve the old puzzle):

- http://en.wikipedia.org/wiki/Hopcroft%E2%80%93Karp_algorithm

The steps to solve the problem are:

Create the bipartite graph of incompatible voters; I used sets to do it as fast as possible
Apply the Hopcroft-Karp algorithm to obtain the maximum matching, whose size is the minimum number of voters needed to cover all the conflicts in the graph
Subtract that number from the total number of voters; this gives us the max number of happy voters that we can obtain :)
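The following is a minimal sketch of these steps, with one deliberate simplification: instead of Hopcroft-Karp it uses plain augmenting paths (often called Kuhn's algorithm), which finds the same maximum matching with a worse time bound. The function name max_happy_voters and the (keep, remove) input format are assumptions for this example, not the linked solution.

```python
def max_happy_voters(votes):
    """votes: list of (keep, remove) pairs such as ("C1", "D2")."""
    cat_lovers = [v for v in votes if v[0].startswith("C")]
    dog_lovers = [v for v in votes if v[0].startswith("D")]

    # conflicts[i] lists the dog lovers that cannot be satisfied together
    # with cat lover i: one wants to keep what the other wants to remove.
    conflicts = [
        [j for j, (d_keep, d_remove) in enumerate(dog_lovers)
         if c_keep == d_remove or c_remove == d_keep]
        for c_keep, c_remove in cat_lovers
    ]

    match_of_dog = [None] * len(dog_lovers)  # dog lover index -> cat lover index

    def try_augment(i, visited):
        """Try to match cat lover i, re-matching earlier ones if needed."""
        for j in conflicts[i]:
            if j in visited:
                continue
            visited.add(j)
            if match_of_dog[j] is None or try_augment(match_of_dog[j], visited):
                match_of_dog[j] = i
                return True
        return False

    matching = sum(try_augment(i, set()) for i in range(len(cat_lovers)))
    # Maximum matching = minimum number of voters to discard (König's theorem).
    return len(votes) - matching


if __name__ == "__main__":
    print(max_happy_voters([("C1", "D1"), ("C1", "D1"), ("C1", "D2"), ("D2", "C1")]))
```

For larger inputs, swapping try_augment for a Hopcroft-Karp phase-based search would keep the same interface and result.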

Thanks Spotify for these puzzles :)

Monday, 4 February 2013

Spoofed links on Facebook

This bug allows sharing links on Facebook that look absolutely legit but link to a different location than the one Facebook shows.
The problem is the flow that Facebook uses when a user shares a link:

Connect to the linked site
Download the site information
Check that everything is correct, sanitize everything in order to ensure that the user doesn't send something that can harm other users, etc
Put all the information in a form that the user will send to publish the link
And, this is the problem, trust the information that the user sends in this form

All developers know that it is dangerous to trust the information that users send. If the user wants, he can send a spoofed form, altering the information contained in the form values.
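One common way to avoid trusting such a form is to have the server sign the values it scraped and verify the signature when the form comes back. The sketch below only illustrates that general idea; it is not how Facebook works, and SERVER_SECRET, sign_preview and is_untampered are hypothetical names.

```python
import hashlib
import hmac
import json

# Hypothetical secret that never leaves the server.
SERVER_SECRET = b"replace-with-a-real-random-secret"


def sign_preview(url: str, title: str) -> str:
    """Sign the scraped values before placing them in the form."""
    # json.dumps gives an unambiguous encoding of the two fields.
    message = json.dumps([url, title]).encode("utf-8")
    return hmac.new(SERVER_SECRET, message, hashlib.sha256).hexdigest()


def is_untampered(url: str, title: str, signature: str) -> bool:
    """Check that the submitted values still match what the server scraped."""
    expected = sign_preview(url, title)
    return hmac.compare_digest(expected, signature)


if __name__ == "__main__":
    sig = sign_preview("https://example.com/", "Example")
    print(is_untampered("https://example.com/", "Example", sig))   # True
    print(is_untampered("https://evil.example/", "Example", sig))  # False
```

With a check like this, a user who edits the URL or title in the form cannot produce a matching signature without the server's secret.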

I reported this bug on the 30th of December (more than a month ago), but I didn't receive any response, and they didn't watch or download the demonstration video (I uploaded the video to one of my servers, and I can't find any access to the video in the access_log file):

Thursday, 6 December 2012

Non-persistent XSS on the Spanish Senate site

Last week, when I was visiting the Senate web site, I discovered something stupid: when you add some GET params to the URL, these params are included directly, without escaping, in the links used to share the current page on the different social networks (Facebook, Tuenti and Twitter).
In the next example I add a new attribute with name "aaabbb" and value "hello" to the social links:

OK, I can add attributes to the links. Now the problem is that if I try to use any of these characters: ', <, >, (, ) , the page redirects us to a 404 page:

Well, we have a problem: we can insert JavaScript code inside the attributes, but we can't execute anything without the ( or ) characters... or can we...

If you insert an attribute, for example onmouseover, you can include assignments, for example:

var aux1 = this.parentNode.parentNode.innerHTML;

We didn't use any of the forbidden characters, and we have a very interesting string like:

OK, what can we do with variable assignments and this string in JavaScript to execute code out of the jail....

We can, for example, do the following:

this.innerHTML = "<img onload=\"code_to_execute();\">";

But... as you can see, I'm using the <, (, ) and " characters, and I can't use these characters in the URL. So, to create the "<img..." string I can use the string that I got from the aux1 assignment.

For example, for the < character I can use aux1[0], which concatenated with aux1[2] gives me the string "<i", and I don't need to use any of the forbidden characters. Then doing:

this.innerHTML = aux1[0] + aux1[2] ....

I can obtain any string to be used as innerHTML, and then I'm out of the jail :)
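The index-by-index concatenation can be sketched in Python as follows. This only illustrates the string-building idea: the function name build_injection, its parameters and the sample source string are assumptions for the example, and it works only when every needed character already appears somewhere in the captured string.

```python
def build_injection(source: str, target: str, var_name: str = "aux1") -> str:
    """Express target as a concatenation of characters indexed from source.

    source:   the string already available in the page (the value of aux1)
    target:   the string we want to build without typing it literally
    var_name: name of the JavaScript variable that holds source
    """
    parts = []
    for ch in target:
        index = source.find(ch)  # first position of the character, or -1
        if index == -1:
            raise ValueError(f"character {ch!r} is not available in the source string")
        parts.append(f"{var_name}[{index}]")
    return " + ".join(parts)


if __name__ == "__main__":
    # Hypothetical captured markup, only to show the shape of the output.
    print(build_injection("<li>x", "<i"))
```

The generated expression uses only the variable name, square brackets, digits and plus signs, so none of the filtered characters appear in it.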

It is really tedious to create complex strings by concatenating characters one by one, so to make this easy I wrote a very simple PHP script:

Inside the "buildInjection" call parameter you can put whatever code you want. For this example, I use it to load, using jQuery, a script from an external server, so you can include more complex code in that script.

The code that I inject replaces the art gallery with a more sophisticated one :) :

In summary, yes, we don't have a permanent XSS, but we have control of the site through the social links, so we can, for example, share this URL instead of the correct one on the social networks as an "XSS Worm":

http://en.wikipedia.org/wiki/XSS_worm

I reported this bug a week ago to the official Senate Twitter account, and by e-mail, but I didn't receive any answer:

https://twitter.com/alonsovidales/status/274329835646615552

A video that shows the whole process: