RESTful API design best practices

For this new project I am currently involved in, I need to design and criticize some APIs. So I thought maybe digging up a little bit about RESTful API might be helpful for me.

When we are talking about API, we know that we are talking about Abstract Programming Interfaces, indeed but what makes a API good and bad? People having this religious debates about SOAP and REST, but U want to know, what should we keep in mind when we are designing a RESTful API.

First of we need to be clear about our goals why and what for we are creating this APIs. We create APIs for another developer so that they can offer good stuff to their customers. But the goal of api is to scale, so that our product can reach more people.
Why do we need to go for REST? Well, Rest is not a standard but a way of doing something that makes life easier. It has this following features:

i) Scalability: It has theability to make it grow faster. Say we want to add a new feature in our api, we can add it without much effort.
ii) Generality: It has to use a general protocol, like http, or https.
iii) Independence: Another good thing about REST is that, different part of apis can be independent in this architecture.
iv) Latency: caching is another big feature.
v) Security: as REST use http type protocols, it also has some good feature of security, we can use http headers for security purpose.
vi) Encapsulation: Rest architecture only shows thing that could possibly be necessary to other.

Why json?
Ubiquity: well, 50-70% of the market uses JSON.
Simplicity: It has a simpler grammar than any other competitors.
Readability: We don’t need any other prior knowledge to read jsons.
Scalability: Jsons are easy to add to.
Flexibility: It is flexible, everything can fit in json.
HATEOAS: it does not contain content type or anything, it only contains things that is necessary.

REST is easy for user but it can be hard for those who makes it. But one thing we need to keep it in mind, it is not a standard like w3 but it is a style, it is an architecture. This is easy to maintain if we keep few things in mind.

Suppose we have this application that creates user, gets user derails, change user details and so on. So one noob approach would be:

/createUser
/getUserDetail
/changePassword
…and so on.

Is it good? it is not. Not only it is bad, it also confuses the user of our API. We should put only nouns in our API request url not behaviors like doThis. We can do that with less urls to request. In fact we can do that with one using http methods, GET, PUT, POST, DElete, head, (PATCH). CRUD is not necessarily a one to one correspondent of GET, PUT, POST, DELETE. In fact PUT and POST can be used to get, create and update. Before discussing about POST or PUT, I think we can go for something simple, like getting information. I think there can be 2 types of resources need to be care about, we can get whole bunch of users, or a single instance of users.
i) collection
/users <- it is a plural
ii)instance
/users/1231 <-nested

Now lets talk about PUT. PUT has this little complicacy that, PUT is idempotent, it sends all the same state remains same does not matter if we get a 101 or what as a result. So need to take care sometime. We need to present all the field present when we do PUT. With PUT can’t support partial update. As PUT returns some data in the end a blank PUT can be used similarly as GET if implemented like that.

POST always done on parent (eg: /users) for POST 200 response is not good enough to return, as when the request is successful it also returns 200, it can be 201. but for update 200 is ok. POST is the only method not idempotent. We can make it idempotent. Some rest api has data limit, we can reduce it using partial updates. Also has impact on performence.

Now it is always a good idea to deal with Accept header of a request and send priority list which it can accept. When we send a request, it is also important to have a content-type.  We can do it easily if we use it as an extention.

applications/aln2c3.json.

BASE URLs does not mean a lot in rest api. https://api.foo.com is better looking than http://www.foo.com/dev/service/api/rest .Version should be in url, https://api.foo.com/v1, or as media type application/json;application&v=1, for those who changes api very frequently media type is better option. But it requires more work to make it done. not ideal and it is not completely fall under rest api system. About versioning, 1.2.3.4 should not be at api version of data exchange.

It is always better to use Camel cases not underscore when it comes to json. Date should be at iso 8601 and it should be UTC rather than anything else, it saves a lot of work.

There should always have ids but does the user need to know about this ids? Nope, we can use HREF in json media, instead. Hypermedia is paramount, does not have xlink like xml.

Now we talked about the fundamentals, but at the same time giving too much information on each api calls are a bad idea because it costs you money. So what can you do? You can ask user to request, what do you want? and upon his request you can extend your api response.

?expand=resource1, resource2

 

It is being considered as different resource so it will be stored in different cache. Offset and limits can also be done the same way.

?offset=5-&limit=25

It is always our dream that there won’t be any errors anywhere, but practically it never happens, but error can be tackled. So it is always better to keep user updated about the errors. Rather than just let them check the headers for error messages we should tell them in json response status about it. we should describe that error to him, give them a link to get notified about the error updates. Thats how we can help them.

About security, I would say sessions does not do much good other than over heads, sessions can be copied and you won’t see it coming, not only that session requires extra clustering at your server, it also better not to have them when you need better performance. Url changes all the time, so it is better not to hash urls, rather than urls rather use it on content.

There are also some people that believe oAuth2 better than oAuth1a, but honestly speaking oauth is not more secure than 2, oAuth2 wanted to make life easier though some people really doubt about its success. The use bearer token can be questioned.

It is always better to use api keys, rather than user passwords.
ENtropy: long in length, you can’t guess like a password.
password reset: does not affect on password change.
independence: not broken on password change or anything.
speed: If we use bcript  it may take 0.1 or 0.5 s, user may never notice, but some hacker will notice that it will take time.
limited exposure: only who downloaded it, he knows
traceablity: we can trace why it did not work.

Now for login there is this common misconception on when to use 401 and 402.

401 unauthorized means unauthenticated, maybe password was unchecked.
402 forbidden, credentials taken but found out he is not allowed

It is always good idea to add this POST method, because some user may never heard of DELETE kind of methods. And why post? Because POST is  the only method that is not idempoten.

?_method=DELETE

How to automate numpy installation in your project using setuptool?

Recently for an opensurce project, I had to work using numpy. Numpy makes life a lot easier, without any doubt, but when it comes to setup.py of setuptool. It does not work just by adding <code>’install_requires:[“numpy”]’ </code> mainly because unlike any other python packages numpy is written in C/C++, mainly because of the optimization, interpreted language like python are not fast enough for mathematical computations. They ported their code in python. So before using them we need to compile them.

To do that we need to use custom commands, for custom commands in setuptool, we need to add “cmdclass” as a key in the dictionary. “build_ext” is a cython command which helps to compile files of numpy. But before compiling is done we should not call anything else, so we need to customize some existing class. So we overwrite build_ext class with an extension of it.

So the code should look like this:

 

from setuptools import setup
from setuptools.command.build_ext import build_ext as _build_ext

class build_ext(_build_ext):
    'to install numpy'
    def finalize_options(self):
        _build_ext.finalize_options(self)
        # Prevent numpy from thinking it is still in its setup process:
        __builtins__.__NUMPY_SETUP__ = False
        import numpy
        self.include_dirs.append(numpy.get_include())

config = {
 'cmdclass':{'build_ext':build_ext}, #numpy hack
 'setup_requires':['numpy'],         #numpy hack
 #...
 'install_requires': ["numpy" ]
}

 

Now why do we write __NUMPY_SETUP__=false? I have found an interesting answer @ StackOverFlow by coldfix.

In the source code of numpy we have got this:

 

if __NUMPY_SETUP__:
    import sys as _sys
    _sys.stderr.write('Running from numpy source directory.\n')
    del _sys
else:
    # import submodules etc. (main branch)

So when NUMPY_SETUP is true it does not import submodules so we need to make it sure that it does not get called first time. Then we include it in our library path.

 

 

Why Android app DOES NOT run on Java Virtual Machine!

I had a misconception for like years that as we write code in java for android application so there must be an JVM [just like on J2ME mobiles ] in underline android Operating System but actually I was actually wrong. So I have written a little bit on the differences.
Its true that for most of the android development we use java and we need to compile it like other java development. When in our IDE we hit compile button it also generates Java byte-code files which will be better understood if I say .class files. But later on these .class files get compressed and get converted into a single byte code file using DX converter which usually known as classes.dex. Then this dex filles packaged with other application resources and become a .apk file. We can install this .apk file in the devices. When we run this application, JRE does not run the .dex file instead Dalvik Virtual Machine executes that dex bit codes.
Since JVM and Delvik are turing complete, so basically end of the day they do the same things. But it has internal differences. The key difference is that JVM is stack-based, while Dalvik is register based.
I think it will be better if we remind the example of stack based addition and register based addition.
In an stack stack based addition when we need to add a set of numbers we put all of them into a stack then we would pop top two and add them up then again we would push that summation in the stack and we will pop two and sum them up and we will repeat this process until the stack is empty. So it takes minimal process to implement stack based VM because only 2 register can do it. Which is very important for JVM because it supports a wide range of devices.

On the other hand register based system it requires to tell the explicit memory location in the memory if we want it to add. But the average register instruction is larger than an average stack instruction. once we tell them the location explicitly we don’t need to go through all this over head of push, pop which helps android to run on slow CPU and less RAM. Additionally in register based virtual machine few optimization can be done which on low battery consumption. DVM has been designed so that a device can run multiple instances of the VM efficiently.