Microsoft Research Projects

http://research.microsoft.com/en-us/projects/mslr/download.aspx

We release two large-scale datasets for research on learning to rank: MSLR-WEB30K, with more than 30,000 queries, and MSLR-WEB10K, a random sample of it with 10,000 queries.

Below are two rows from the MSLR-WEB10K dataset:
=============================================================
0 qid:1 1:3 2:0 3:2 4:2 … 135:0 136:0 
2 qid:1 1:3 2:3 3:0 4:0 … 135:0 136:0 
=============================================================

In the data files, each row corresponds to a query-url pair.

The first column is the relevance label of the pair, the second column is the query id, and the remaining columns are features. The larger the relevance label, the more relevant the query-url pair. Each query-url pair is represented by a 136-dimensional feature vector.
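
As an illustration of the row layout described above, here is a minimal Python sketch that splits one row into its relevance label, query id, and features. The function name parse_row is hypothetical and not part of the dataset release. The example row is a shortened copy of the second sample row, keeping only its first four features. The sketch assumes whitespace-separated tokens of the form index:value and keeps the query id as a string.

```python
from typing import Dict, Tuple


def parse_row(line: str) -> Tuple[int, str, Dict[int, float]]:
    """Parse one row of the form '<label> qid:<id> <index>:<value> ...'.

    Hypothetical helper for illustration only.
    """
    tokens = line.split()
    if len(tokens) < 2:
        raise ValueError("a row needs at least a label and a qid token")

    # First column: relevance label of the query-url pair.
    label = int(tokens[0])

    # Second column: query id, written as 'qid:<id>'.
    key, _, qid = tokens[1].partition(":")
    if key != "qid":
        raise ValueError(f"expected 'qid:<id>', got {tokens[1]!r}")

    # Remaining columns: features, written as '<index>:<value>'.
    features: Dict[int, float] = {}
    for token in tokens[2:]:
        index, _, value = token.partition(":")
        features[int(index)] = float(value)

    return label, qid, features


# Shortened example row: only the first four features (real rows carry 136).
label, qid, features = parse_row("2 qid:1 1:3 2:3 3:0 4:0")
print(label, qid, features)
```

Rows that share a query id can then be grouped on the returned qid value, which is what a learning-to-rank model needs in order to compare urls for the same query.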

The details of features can be found here.