Thursday, March 29, 2018

Fixing xref-find-references

I was annoyed that xref-find-references searched more than the files in the TAGS file (it seems to search everything below the top level directory of the TAGS file?) so I went looking for a fix. I found that apparently my emacs knowledge is out of date (the fact that it's xref now and not tags was my first clue). I couldn't find any way to customize xref-find-references. Instead I found people referring to project, ivy, helm, confirm, The Silver Searcher (ag), ripgrep (rg), and dumb jump. I didn't go all the way and get into project or ivy or any of the others, but I did download ag and rg and tried them from a command-line outside of emacs and saw exactly what I was expecting xref-find-references to do. I figured all I needed was to replace xref-find-references with one of those. I got ag.el installed and working before any of the ripgrep packages (there's both rg.el and ripgrep.el) and then struggled to remap M-? to call ag-project instead of xref-find-references. The thing that finally worked was remapping commands. Here's the magic:

(define-key global-map [remap xref-find-references] 'ag-project)

And actually, to work completely like xref-find-references I added one option to the ag command, --word-regexp, like so (oh, I also removed --stats which is there by default in ag.el):

(setq ag-arguments (list "--word-regexp" "--smart-case"))

Much better. Are all those other packages worth digging into? I'm not particularly unhappy with ido.

Thursday, March 22, 2018

It Is Time To Replace Passwords With Keys

It is time to stop using passwords. The Troy Hunt article, Passwords Evolved: Authentication Guidance for the Modern Era just got passed around the office again and wow, what a mess we are in. Instead of passwords we should switch to using public-key cryptography like TLS and ssh use.

"But people can barely manage passwords, there's no way they can manage keys!"

Wrong. People can't barely manage passwords, they can't manage them at all. Read the Troy Hunt article where he says, "Embrace Password Managers." We've given up on humans managing passwords and we are now relying on software to do it. We use software that securely stores our passwords and synchronizes them over the internet so that we have those passwords on all our devices. Often the software generates the passwords for us too. Guess what we could do with keys? The exact same thing.

Now read the Troy Hunt article again and this time think about how it's not just end-users that can't manage passwords, but how developers also cannot manage passwords. There are UI problems, there are concerns with code injection attacks, there are problems with hashing, salting, and storing passwords and with protecting those stored passwords from thieves. Those problems all go away if developers only have to store public keys. No hashing, no salting, no secrets. Think about it. As a user the only thing you'll give to a website to authenticate is your public key. You don't have to trust them with any secrets at all. From the point of view of a developer, you don't have to worry about keeping your customers secrets safe anymore. What a relief! As for UI, if we do it right websites and apps don't need any UI at all for authentication. What is the UI for TLS? The lock symbol that the browser displays. That's it! Websites you visit authenticate themselves to you using strong public-key cryptography behind the scenes, under the covers. It could be the same with user authentication.

Now read the article one more time and think about how not only are users and developers unable to deal with passwords, but security experts can't either. They can't agree on what makes a good secure password, what format it should be in. Special characters? Random strings of characters? Long passphrases of regular words? What's easier to create? Easier to type? Easier to memorize? Should a website show the password as users type it or not? How often should we change passwords? If we are giving up on memorizing and using password managers, does any of that matter? Maybe?

Now think about public-key cryptography. Security experts generally all agree on what makes good public-key cryptography, what format the keys should be in and what length. That was all hashed out years ago. True, there might be disagreement on when stronger keys should be used, whether to use RSA or ECC, and if ECC which curves to use and so on, but regular users relying on key manager software don't have to be involved in those discussions. They don't have to worry anymore about whether they should use a special character or how long of a password to use or if passwords made up of song lyrics are a good idea or not. The discussions on key size or which ECC curve to use raises the debate way beyond trying to figure out what is user friendly but still defeats rainbow tables and script kiddies. It takes the debate up to the level of wondering which nation state might attack you. If we eliminate human involvement in coming up with authentication tokens and remove script kiddies from the attack surface altogether that's a *huge* improvement over passwords.

"OK, but if we use public-key cryptography we also need to use full PKI like TLS."

Do we? PKI provides identity confirmation and key revocation. Do we have identity confirmation for account passwords today? When I create an account with, say, Amazon do they verify that I really am who I say I am? Nope. Not at all. They don't care one bit about who I really am. They just want my money. What about key revocation? Do we have password revocation today? Again, other than manually logging into a website and changing your password, we don't, and we can easily duplicate that manual process with keys in order to get us started. If we don't have full PKI with our public-key authentication we are no worse off than we are today.

The great thing about switching to public-key cryptography is that someday we could add in some sort of Let's Encrypt-like easy-to-use PKI if we want, which would take us light-years beyond where we are with passwords today. We aren't going to get there though if we don't take the first step of ditching passwords.

Getting rid of password authentication and using public-key cryptography instead will make user authentication easier for users, developers, and security experts, and it will make us all more secure.

Friday, September 15, 2017

Not Leaky, Just Wrong

Intel recently announced new tools for FPGA design. I should probably try to understand OpenCL better before bagging on it, but when I read, "[OpenCL] allows users to abstract away hardware-specific development and use a higher-level software development flow." I cringe. I don't think that's how we get to a productive, higher-level of abstraction in FPGA design. When you look at the progress of software from low-level detailed design to high-level abstract design you see assembly to C to Java to Python (to pick one line of progression among many). The thing that happened every time a new higher-level language gained traction is people recognized patterns that developers were using over and over in one language and made language features in a new language that made those patterns one-liners to implement.

Examples of design patterns turning into language features are, in assembly people developed the patterns of function calls: push arguments onto the stack, save the program counter, jump to the code the implements the function, the function code pops arguments off the stack, does it's thing, then jumps back to the the code that called it. In C the tedium of all that was abstracted away by the language providing you with syntax to define a function, pass it arguments, and just call return at the end. In C people then started developing patterns of structs containing data and function pointers for operating on that data which turned into classes and objects in Java. Java also abstracted away memory management with a garbage collector. Patterns in Java (Visitor, State, etc.) are no longer needed in Python because of features in that language (related discussion here).

This is the path that makes most sense to me for logic design as well. Right now in RTL Verilog people use patterns like registers (always block that activates on posedge clk, has reset, inputs, outputs, etc.), state machines (case statement and state registers, next_state logic...), interfaces (SV actually attempted to add syntax for this), and so on. It seems like the next step in raising the abstraction level is to have a language with those sorts of constructs built-in. Then let people use that for a while and see what new patterns develop and encapsulate those patterns in new language features. Maybe OpenCL does this? I kind of doubt it if it's a "software development flow." It's probably still abstracting away CPU instructions.

Wednesday, May 24, 2017

Facebook Should Split In Two

Facebook has done wonders to get people creating and consuming content on the internet. However, Facebook has grown to the point where it has no competition and is no longer innovating in ways that benefit us. Facebook should split into Facebook the aggregator and Facebook the content hoster.  You could talk about a third piece that is Facebook the content provider, which is for providing things like gifs, templates, memes, emoji, games, and other stuff like that.  Because Facebook hasn't completely broken from open web standards those types of content providers already exist today.

Aggregators would be where you go to set up your friend list and see your feed.  It could look and feel like Facebook does now.  It would have an open standard protocol that content hosters would use if they wanted to be aggregated.  This could still be an add driven business, but subscription, self hosted, and DIY solutions could exist too.

Content hosters could either charge a monthly hosting fee, or they could serve up their own adds.  Self hosted and DIY solutions could also exist.

The big benefit to this would of course be the competition.  Since it's an open standard anyone could be a content host, and anyone could be an aggregator.

To make extra sure there is competition, and this could come in a phase two after the initial splitting up of Facebook, there should be open standards for exporting and importing friends, follows, likes, etc. to and from aggregators, and open standards for importing and exporting content from the hosters.

Speaking of follows and likes, there could also be aggregator aggregators (AAs). People could opt in to publicly and anonymously share their likes and follows and the AAs would consume those and report on trends that cross aggregator boundaries.  Anonymity could be much more protected this way while still giving us that interesting information about what is trending.

One tricky part of this is how do I as a content author only allow my friends to see certain posts of mine?  It would have to be with encryption. My content provider could keep public keys of my friends and only my friends (well, their aggregators) would be able to decrypt my posts using my friends' private keys.  I can see some challenges and holes in this, but it doesn't seem any worse overall than how Facebook protects privacy now. Open implementations and peer review could get us to better-than-Facebook privacy quickly.

Facebook would ideally recognize their stagnation and initiate this split themselves. We as their user base can and should help them understand the importance of this. Hopefully it doesn't have to come down to government enforcement of anti-trust laws, but that could be a useful tool to apply here as well.

Monday, March 13, 2017

Quick Thoughts on Creating Coding Standards

Introduction

No team says, "write your code however the heck you want." Unless you are coding alone, it generally helps to have an agreed upon coding standard. Agreeing upon a coding standard, however, can be a painful process full of heated arguments and hurt feelings. This morning I thought it might be useful to first categorize coding standard items before starting the arguments. My hope is that once we categorize coding standard items we can use better decision criteria for each category of items and cut down on arguing. Below are the categories I came up with really quickly with descriptions, examples, and decision criteria for each category. Feedback is welcome in the comments.

Categories of Things in Coding Standards

Language Specific Pitfalls

Characteristics​

  • not subjective, easy to recognize pattern
  • well recognized in the industry as dangerous
  • people have war stories and about these with associated scars to prove it

Examples

  • no multiple declarations on one line in C
  • Cliff Cummings rules for blocking vs. non-blocking assignments in Verilog
  • no willy nilly gotos in C
  • no omitting braces for one liner blocks (or begin-end in Verilog)
  • no compiler warnings allowed

How to resolve disputes on which of these should be in The Coding Standard?

Defer to engineers with best war stories. If nobody has a war story for one, you can probably omit it (or can you?).

General Readability/Maintainability

"Any fool can write code that a computer can understand. Good programmers write code that humans can understand." –Martin Fowler

Characteristics

  • things that help humans quickly read, understand, and safely modify code
  • usually not language specific
  • the path from these items to bugs is probably not as clear as with the above items, but a path does exist

Examples

  • no magic numbers
  • no single letter variable names
  • keep functions short
  • indicators in names (_t for typedef's, p for pointers, etc.)

How to resolve disputes on which of these should be in The Coding Standard?

If someone says, "this really helps me" then the team should suck it up and do it. This is essentially the "put the slowest hiker at the front of the group" principle.


Alternatively these can be discussed on a case by case basis during code reviews instead of being codified in The Coding Standard. Be prepared for more "lively" code reviews if you go this route.

Code Formatting

The biggest wars often erupt over these because they are so subjective. This doesn't have to be the case.

Characteristics

  • these probably aren't really preventing any bugs
  • most can easily be automatically corrected
  • are largely a matter of taste
  • only important for consistency (which is important!)

Examples

  • amount of indent
  • brace style
  • camelCase vs. underscore_names
  • 80 column rule
  • dare I even mention it? tabs vs. spaces

How to resolve disputes on which of these should be in The Coding Standard?

Don't spend a long time arguing about these. Because they are so subjective and not likely to cause or reduce bugs one way or the other, nobody should get bent out of shape if their preference is not chosen by the team. Give everyone two minutes to make their case for their favorite, have a vote, majority wins, end of discussion. Use an existing tool (astyle, autopep8, an emacs mode, whatever is available for the language) to help people follow these rules.