Allow me to make one thing perfectly clear: If you insert those symbols into my perfectly working website… only to mess with me and inadvertently give me Vietnam-style flashbacks to the days when I had to deal with incredibly badly formed and misencoded CSV files on the daily…
Then I will find you and break into your home to replace every second sock with one of the same color and pattern but slightly different make, size or material and you will always wonder why you can’t find any exactly fitting pairs of socks anymore.
I willingly embraced mismatched socks years ago. I just pretend it’s a fashion statement. Come at me bro.
ooh, so it was you then!! 😠
To the op of the screenshot meme, calm down satan
Usually only happened when a French person copied and pasted their text directly from a Word document… dang weird spaces and accented characters… drove my boss mad when I told him it was because it was French, and not a glitch.
Still had to work around it… text counters in textboxes had to account for accented characters, which took two bytes instead of one.
“I only have 2000 letters!” … 2000 including 200 accent characters made it 2200 characters, not 2000.
Easy. Just use UTF-32 and make the text field a maximum of 500 letters. That will be a maximum of 2000 bytes, doesn’t matter if the user is French or Chinese.
“I only have 2000 letters!” … 2000 including 200 accent characters made it 2200 characters, not 2000.
Or, you could count it in Unicode characters, and not in whatever bizarro charset you’re using over there. Then “À” is one character, just as it’s supposed to be.
The problem typically comes from improper conversion between charsets. Like Windows-1252 to Unicode, or something equally horrible.
Not if the maximum is due to the database being configured to have a maximum space of 2000 bytes for that field.
Ah, well, bytes ≠ characters, but yes, that can certainly be an issue.
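To make the bytes-vs-characters point concrete, here is a minimal sketch in C, assuming the text arrives as valid, NUL-terminated UTF-8. The function name utf8_code_points is made up for this example. It counts code points, which is still not quite the same as what a user perceives as a letter: an “À” typed as “A” plus a combining accent would count as two.

```c
#include <stddef.h>

/* Count Unicode code points in a NUL-terminated string that is ASSUMED
   to be valid UTF-8 (no validation is done here).
   Continuation bytes look like 10xxxxxx, so every byte that does not
   match that pattern starts a new code point. */
size_t utf8_code_points(const char *s)
{
    size_t count = 0;
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            count++;
    }
    return count;
}
```

With this, the precomposed “À” (bytes 0xC3 0x80) is one code point even though strlen() reports two bytes, so a counter shown to the user and a byte limit in the database need to be checked separately.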
Allow me to introduce you to my favorite Unicode character, the zero width space
Unfortunately, evil people blacklist this character a lot :(
May I introduce you to my favorite Unicode character, the Braille zero dots
What kind of devil came up with this?
The justification is on the linked page
that sounds awesome! (there’s 10k zero width spaces between the quotes ->‘’.)
Odd, on the Connect app it shows a bunch of spaces, but not 10k of them.

That’s a 29 KB comment right there
Hey that’s a neat tool. Link?
I’ll be damned. The crazy son of a bitch did it!
Checks out:

On Interstellar it shows up as normal spaces for me so there’s just a giant block of empty space
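The 29 KB figure can be sanity-checked by hand, assuming the comment is stored as UTF-8. The zero width space is U+200B, which falls in the three-byte range. A small sketch (utf8_encoded_len is a name invented for this example):

```c
#include <stddef.h>
#include <stdint.h>

/* Number of bytes UTF-8 needs to encode one code point.
   Returns 0 for values that are not encodable (surrogates, or anything
   above U+10FFFF). */
size_t utf8_encoded_len(uint32_t cp)
{
    if (cp < 0x80)
        return 1;
    if (cp < 0x800)
        return 2;
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0;               /* surrogates are not valid scalar values */
    if (cp < 0x10000)
        return 3;
    if (cp <= 0x10FFFF)
        return 4;
    return 0;
}
```

10,000 × utf8_encoded_len(0x200B) is 30,000 bytes, or roughly 29.3 KiB, before counting the visible text around the quotes.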


My day has come!!!
Don’t sanitise inputs. Reject non-conforming inputs entirely.
But otherwise: yes.
No fuck both of those, just use prepared statements so user input can’t be interpreted as SQL.
And end up having loads of valid requests rejected 😁
If they were valid they wouldn’t be rejected.
Well then someone with a Tagalog name gets caught in your filter…
I mean if it’s “perfect” then yes, it’ll work, but in production…
Also, you sometimes want to be able to store “1); Drop table abc;” in your database, I mean how do you otherwise store this comment right here? Sanitizing.
That’s conforming (to whatever criteria). Send me a UTF-16 string of at most 100 code points. Send me a 7-bit ASCII string of only A-Z0-9. Reject anything that doesn’t conform.
Sanitizing is trying to clean an input. That’s “lemme just double escape some special characters”, or stripping/replacing/encoding characters, or truncating strings, or coercing types. Don’t do this; your sanitization code will have bugs or edge cases.
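As a rough illustration of “reject, don’t clean”, here is a sketch in C for the “7-bit ASCII string of only A-Z0-9” case. The function name, the explicit length parameters and the choice to reject empty input are all assumptions made for this example, not anything from a real library.

```c
#include <stdbool.h>
#include <stddef.h>

/* Return true only if buf holds between 1 and max_len bytes, all of them
   in A-Z or 0-9. The input is never modified: it either conforms or it is
   rejected as a whole. Rejecting empty input is an assumption of this sketch. */
bool conforms_upper_alnum(const char *buf, size_t len, size_t max_len)
{
    if (buf == NULL || len == 0 || len > max_len)
        return false;
    for (size_t i = 0; i < len; i++) {
        char c = buf[i];
        bool upper = (c >= 'A' && c <= 'Z');
        bool digit = (c >= '0' && c <= '9');
        if (!upper && !digit)
            return false;
    }
    return true;
}
```

Note that there is no “fix it up” branch anywhere: a lowercase letter, a semicolon or an over-long string all end in the same plain rejection.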
I agree with everything in your comment except the last word. Only sanitize in cases where there isn’t a better option, like HTML or terminal escape sequences. SQL has prepared statements, which are better.
I’ll never conform!
Righteous.
How would you do this in C? I’m a beginner. Does it entail checking/disallowing certain characters and data types? What? 😃
If you use the SQLite C API like this

    char query[256];
    snprintf(query, sizeof(query), "SELECT * FROM users WHERE username = '%s'", username);
    int rc = sqlite3_exec(db, query, NULL, NULL, &err_msg);

and someone enters

    Robert'; DROP Table Students;--

as username, it deletes the table Students.

    const char *sql = "SELECT * FROM users WHERE username = ?";
    int rc = sqlite3_prepare_v2(db, sql, -1, &stmt, NULL);
    if (rc != SQLITE_OK) {
        fprintf(stderr, "Failed to prepare statement\n");
        return;
    }
    sqlite3_bind_text(stmt, 1, username, -1, SQLITE_STATIC);

Using this “prepared statement” and “bind”, your code is secured against such SQL injection attacks.
How do you sanitize your inputs or how do you exploit inputs which are not sanitized.
Sanitize inputs.
I’ll get back to you on exploits when I can write something that throws zero compilation errors. 😈
A couple of big things: 1. Only accept reasonable characters from a whitelist, instead of rejecting bad characters based on a blacklist. This makes you less likely to forget to block \0, for example. 2. Understand how strings work and ensure that neither reading from nor writing to a string extends beyond the end of the memory allocated for it. For example, do you understand what a \0 would do to a string your program accepts?
Sic! Thanks! I’ll work on this this weekend! 😊
Keep in mind, the lowercase and uppercase letters are in contiguous blocks in the ASCII table, so you can use that to verify whether a char is a letter without an incredibly long chain of if-else statements.
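Putting the whitelist, the \0 question and the contiguous-range trick together, a minimal sketch in C might look like this. It assumes an ASCII-compatible character set, and copy_letters is a hypothetical helper made up for this example. The caller passes the input length explicitly instead of trusting strlen(), so a \0 in the middle of the input is seen and rejected rather than silently cutting the string short.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* In ASCII, A-Z and a-z are two contiguous ranges, so two range checks
   replace a long chain of comparisons. */
bool is_ascii_letter(char c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
}

/* Copy len bytes from src into dst only if every byte is an ASCII letter
   and the result, plus a terminating NUL, fits in dst_size bytes.
   Returns true on success; dst is left untouched on failure. */
bool copy_letters(char *dst, size_t dst_size, const char *src, size_t len)
{
    if (dst == NULL || src == NULL || len >= dst_size)
        return false;                   /* no room for the terminator */
    for (size_t i = 0; i < len; i++) {
        if (!is_ascii_letter(src[i]))   /* also rejects an embedded '\0' */
            return false;
    }
    memcpy(dst, src, len);
    dst[len] = '\0';
    return true;
}
```

With an 8-byte buffer, “Bobby” is accepted, while an 11-letter name or the five bytes a, b, \0, c, d are both refused without writing anything.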
Many languages like C, Java, Python, etc. allow you to construct SQL queries or SQL statements, where SQL is its own language used to communicate with a database like Oracle, MySQL, Postgres or MSSQL. One way to do this is to construct a string in your language using whatever string functions, concatenation, etc. are available in your language. The problem occurs because you usually want some kind of user input as one of the parameters in your SQL query, in order to fetch the correct records the user is asking for. Like, say, a record ID or name. If you do not properly sanitize that ID or name, which originally comes from some type of user input, then a malicious user could carefully craft an ID or name which includes their own SQL and other special characters, which will interfere with the query you intended to construct and instead do something malicious. Like delete records, or obtain records the user is not supposed to have access to.
There are many ways to guard against this, and you should learn about this when you start working with SQL and databases. It’s called a SQL injection.
There is another type of code injection which can occur if you are making exec() calls (or whatever your language uses) to run shell commands. Similar caution should be taken there.
I know what I’m dealing with when I see a query that isn’t using a prepared statement.
I mean a prepared statement is still created with a string.
But you definitely want to be using bind parameters with your prepared statements. Not only for security but also potentially performance improvements.
You wouldn’t - what they’re describing is called “SQL injection” - a way to fool poorly written web server code (regardless of what language it’s written in) into executing SQL code. The poorly written server code takes what’s entered in a form field on a web page and pastes it into a skeleton of a SQL statement - in this case the text in the input field is SQL that ends the intended statement, followed by a new statement that deletes a table. For this to even work, the SQL skeleton on the server would have to be structured in just the right way so the modified version with the pasted-in text still makes sense. For this reason, hackers attempting SQL injection usually have to do a lot of trial and error to get something to happen. The only way it can work at all is if the server software handling the web page sends SQL commands to a database server as text, as if they’re being typed in, and the server executes them. You can’t inject C in this way because unlike SQL, C code isn’t just executed; C programs have to be precompiled.
I � Unicode!
I like this very much! It implies that the person expressing this knows exactly how they feel about Unicode. It’s just us, the readers (or some other link in the chain), who have/ has the wrong encoding.
Unless they work for Microsoft. Teams has been showing � instead of ä for the caller’s name in the popup when someone calls for several weeks now. It didn’t use to do that before. I don’t think they care anymore.
I don’t think it’s even “they” any more.
I remember feeling extra powerful when Moonshell for the DS shipped with UTF-8 and UTF-16 support because the developer was Japanese and wanted to make sure any language would work.
I see little Bobby Tables is all grown up
He’s taking painting classes. Let’s see if the database there can handle him!
Former dev here, can confirm on occasion it does.
[object Object]
NaN
Yes?
What? My mother was a saint!
How to make your code look ‘modern’ 101
□□□□□□□□ !!