Dataset viewer documentation
Preview a dataset
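The dataset viewer provides a `/first-rows` endpoint for visualizing the first 100 rows of a dataset. This gives you a good idea of the data types and example data contained in the dataset.
This guide shows you how to use the dataset viewer's `/first-rows` endpoint to preview a dataset. Feel free to also try it out with Postman, RapidAPI, or ReDoc.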
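The `/first-rows` endpoint accepts three query parameters:
- `dataset`: the dataset name, for example `nyu-mll/glue` or `mozilla-foundation/common_voice_10_0`
- `config`: the subset name, for example `cola`
- `split`: the split name, for example `train`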
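As a minimal sketch of how the three query parameters combine into a request URL, the snippet below builds one with the standard library. The helper name `build_first_rows_url` is hypothetical, and the host `datasets-server.huggingface.co` is simply the one this guide uses in its request example.

```python
from urllib.parse import urlencode

# Host and path as used in this guide's request example.
BASE_URL = "https://datasets-server.huggingface.co/first-rows"


def build_first_rows_url(dataset: str, config: str, split: str) -> str:
    """Hypothetical helper: assemble a /first-rows URL from its three parameters."""
    # safe="/" keeps the slash in names such as "nyu-mll/glue" readable;
    # other special characters are still percent-encoded.
    query = urlencode({"dataset": dataset, "config": config, "split": split}, safe="/")
    return f"{BASE_URL}?{query}"


url = build_first_rows_url("nyu-mll/glue", "cola", "train")
print(url)
```

Building the query string with `urlencode` rather than by hand avoids mistakes when a dataset, subset, or split name contains characters that need escaping.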
Python
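import os
import requests

# Assumption: your Hugging Face access token is stored in the HF_TOKEN environment variable.
API_TOKEN = os.environ["HF_TOKEN"]
headers = {"Authorization": f"Bearer {API_TOKEN}"}
API_URL = "https://datasets-server.huggingface.co/first-rows?dataset=ibm/duorc&config=SelfRC&split=train"

def query():
    # Send a GET request to /first-rows and return the parsed JSON body.
    response = requests.get(API_URL, headers=headers)
    return response.json()

data = query()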
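The endpoint response is a JSON object containing two keys:
- The `features` of the dataset, including each column's name and data type.
- The first 100 `rows` of the dataset, with the content of every column in each row.
For example, here are the `features` and the first 100 `rows` of the `ibm/duorc`/`SelfRC` train split: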
{
"dataset": "ibm/duorc",
"config": "SelfRC",
"split": "train",
"features": [
{
"feature_idx": 0,
"name": "plot_id",
"type": { "dtype": "string", "_type": "Value" }
},
{
"feature_idx": 1,
"name": "plot",
"type": { "dtype": "string", "_type": "Value" }
},
{
"feature_idx": 2,
"name": "title",
"type": { "dtype": "string", "_type": "Value" }
},
{
"feature_idx": 3,
"name": "question_id",
"type": { "dtype": "string", "_type": "Value" }
},
{
"feature_idx": 4,
"name": "question",
"type": { "dtype": "string", "_type": "Value" }
},
{
"feature_idx": 5,
"name": "answers",
"type": {
"feature": { "dtype": "string", "_type": "Value" },
"_type": "List"
}
},
{
"feature_idx": 6,
"name": "no_answer",
"type": { "dtype": "bool", "_type": "Value" }
}
],
"rows": [
{
"row_idx": 0,
"row": {
"plot_id": "/m/03vyhn",
"plot": "200 years in the future, Mars has been colonized by a high-tech company.\nMelanie Ballard (Natasha Henstridge) arrives by train to a Mars mining camp which has cut all communication links with the company headquarters. She's not alone, as she is with a group of fellow police officers. They find the mining camp deserted except for a person in the prison, Desolation Williams (Ice Cube), who seems to laugh about them because they are all going to die. They were supposed to take Desolation to headquarters, but decide to explore first to find out what happened.They find a man inside an encapsulated mining car, who tells them not to open it. However, they do and he tries to kill them. One of the cops witnesses strange men with deep scarred and heavily tattooed faces killing the remaining survivors. The cops realise they need to leave the place fast.Desolation explains that the miners opened a kind of Martian construction in the soil which unleashed red dust. Those who breathed that dust became violent psychopaths who started to build weapons and kill the uninfected. They changed genetically, becoming distorted but much stronger.The cops and Desolation leave the prison with difficulty, and devise a plan to kill all the genetically modified ex-miners on the way out. However, the plan goes awry, and only Melanie and Desolation reach headquarters alive. Melanie realises that her bosses won't ever believe her. However, the red dust eventually arrives to headquarters, and Melanie and Desolation need to fight once again.",
"title": "Ghosts of Mars",
"question_id": "b440de7d-9c3f-841c-eaec-a14bdff950d1",
"question": "How did the police arrive at the Mars mining camp?",
"answers": ["They arrived by train."],
"no_answer": false
},
"truncated_cells": []
},
{
"row_idx": 1,
"row": {
"plot_id": "/m/03vyhn",
"plot": "200 years in the future, Mars has been colonized by a high-tech company.\nMelanie Ballard (Natasha Henstridge) arrives by train to a Mars mining camp which has cut all communication links with the company headquarters. She's not alone, as she is with a group of fellow police officers. They find the mining camp deserted except for a person in the prison, Desolation Williams (Ice Cube), who seems to laugh about them because they are all going to die. They were supposed to take Desolation to headquarters, but decide to explore first to find out what happened.They find a man inside an encapsulated mining car, who tells them not to open it. However, they do and he tries to kill them. One of the cops witnesses strange men with deep scarred and heavily tattooed faces killing the remaining survivors. The cops realise they need to leave the place fast.Desolation explains that the miners opened a kind of Martian construction in the soil which unleashed red dust. Those who breathed that dust became violent psychopaths who started to build weapons and kill the uninfected. They changed genetically, becoming distorted but much stronger.The cops and Desolation leave the prison with difficulty, and devise a plan to kill all the genetically modified ex-miners on the way out. However, the plan goes awry, and only Melanie and Desolation reach headquarters alive. Melanie realises that her bosses won't ever believe her. However, the red dust eventually arrives to headquarters, and Melanie and Desolation need to fight once again.",
"title": "Ghosts of Mars",
"question_id": "a9f95c0d-121f-3ca9-1595-d497dc8bc56c",
"question": "Who has colonized Mars 200 years in the future?",
"answers": [
"A high-tech company has colonized Mars 200 years in the future."
],
"no_answer": false
},
"truncated_cells": []
}
...
],
"truncated": false
}
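Given a response shaped like the one above, the column types and row contents can be pulled out with a few lines of Python. This is a minimal sketch: `sample_response` is a trimmed-down stand-in for the real response, the helper names are hypothetical, and `describe_type` only handles the two type shapes that appear in this example (`Value` and `List`), falling back to the raw `_type` for anything else.

```python
# Trimmed-down stand-in with the same shape as the /first-rows response.
sample_response = {
    "features": [
        {"feature_idx": 0, "name": "title",
         "type": {"dtype": "string", "_type": "Value"}},
        {"feature_idx": 1, "name": "answers",
         "type": {"feature": {"dtype": "string", "_type": "Value"}, "_type": "List"}},
        {"feature_idx": 2, "name": "no_answer",
         "type": {"dtype": "bool", "_type": "Value"}},
    ],
    "rows": [
        {"row_idx": 0,
         "row": {"title": "Ghosts of Mars",
                 "answers": ["They arrived by train."],
                 "no_answer": False},
         "truncated_cells": []},
    ],
    "truncated": False,
}


def describe_type(feature_type: dict) -> str:
    """Render a feature type as a short label such as 'string' or 'List[string]'."""
    if "dtype" in feature_type:
        return feature_type["dtype"]
    if "feature" in feature_type:
        return f"{feature_type['_type']}[{describe_type(feature_type['feature'])}]"
    return feature_type["_type"]


def column_types(response: dict) -> dict:
    """Map each column name to a readable type label."""
    return {f["name"]: describe_type(f["type"]) for f in response["features"]}


def records(response: dict) -> list:
    """Return only the row contents, dropping row_idx and truncated_cells."""
    return [item["row"] for item in response["rows"]]


print(column_types(sample_response))
print(records(sample_response)[0]["title"])
```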
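Truncated responses
For some datasets, the response from `/first-rows` can exceed 1MB. In that case, rows are dropped from the response until its size is under 1MB. This means the response may contain fewer than 100 rows, and the `truncated` field is then `true`.
In some cases, if even the first few rows produce a response larger than 1MB, some of the columns are truncated and converted to strings. These columns are listed in the `truncated_cells` field of each row.
For example, the `GEM/SciDuet` dataset only returns 10 rows, and the `paper_abstract`, `paper_content`, `paper_headers`, `slide_content_text` and `target` columns are truncated: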
...
"rows": [
{
"row_idx":8,
"row":{
"gem_id":"GEM-SciDuet-train-1#paper-954#slide-8",
"paper_id":"954",
"paper_title":"Incremental Syntactic Language Models for Phrase-based Translation",
"paper_abstract":"\"This paper describes a novel technique for incorporating syntactic knowledge into phrasebased machi",
"paper_content":"{\"paper_content_id\":[0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29",
"paper_headers":"{\"paper_header_number\":[\"1\",\"2\",\"3\",\"3.1\",\"3.3\",\"4\",\"4.1\",\"6\",\"7\"],\"paper_header_content\":[\"Introduc",
"slide_id":"GEM-SciDuet-train-1#paper-954#slide-8",
"slide_title":"Does an Incremental Syntactic LM Help Translation",
"slide_content_text":"\"but will it make my BLEU score go up?\\nMotivation Syntactic LM Decoder Integration Questions?\\nMose",
"target":"\"but will it make my BLEU score go up?\\nMotivation Syntactic LM Decoder Integration Questions?\\nMose",
"references":[]
},
"truncated_cells":[
"paper_abstract",
"paper_content",
"paper_headers",
"slide_content_text",
"target"
]
},
{
"row_idx":9,
"row":{
"gem_id":"GEM-SciDuet-train-1#paper-954#slide-9",
"paper_id":"954",
"paper_title":"Incremental Syntactic Language Models for Phrase-based Translation",
"paper_abstract":"\"This paper describes a novel technique for incorporating syntactic knowledge into phrasebased machi",
"paper_content":"{\"paper_content_id\":[0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29",
"paper_headers":"{\"paper_header_number\":[\"1\",\"2\",\"3\",\"3.1\",\"3.3\",\"4\",\"4.1\",\"6\",\"7\"],\"paper_header_content\":[\"Introduc",
"slide_id":"GEM-SciDuet-train-1#paper-954#slide-9",
"slide_title":"Perplexity Results",
"slide_content_text":"\"Language models trained on WSJ Treebank corpus\\nMotivation Syntactic LM Decoder Integration Questio",
"target":"\"Language models trained on WSJ Treebank corpus\\nMotivation Syntactic LM Decoder Integration Questio",
"references":[]
},
"truncated_cells":[
"paper_abstract",
"paper_content",
"paper_headers",
"slide_content_text",
"target"
]
},
...
],
"truncated": true
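The two truncation signals described in this section, the top-level `truncated` flag and the per-row `truncated_cells` list, can be checked programmatically. The sketch below assumes a response with the shape shown in this guide; `truncation_report` is a hypothetical helper, and `sample` is a shortened stand-in for a real response.

```python
def truncation_report(response: dict) -> dict:
    """Hypothetical helper: summarize how a /first-rows response was truncated."""
    rows = response.get("rows", [])
    truncated_columns = set()
    for item in rows:
        # Each row lists the columns whose values were cut and converted to strings.
        truncated_columns.update(item.get("truncated_cells", []))
    return {
        "rows_truncated": bool(response.get("truncated", False)),
        "rows_returned": len(rows),
        "truncated_columns": sorted(truncated_columns),
    }


# Shortened stand-in for a truncated response.
sample = {
    "rows": [
        {"row_idx": 8,
         "row": {"paper_id": "954", "paper_abstract": "\"This paper", "target": "\"but will"},
         "truncated_cells": ["paper_abstract", "target"]},
        {"row_idx": 9,
         "row": {"paper_id": "954", "paper_abstract": "\"This paper", "target": "\"Language"},
         "truncated_cells": ["paper_abstract", "target"]},
    ],
    "truncated": True,
}

report = truncation_report(sample)
print(report)
```

A report like this makes it easy to decide whether a preview is complete enough to use directly or whether the truncated columns need to be fetched another way.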