antiseed
get_aug_antiseed(primary_key, tech_class, pc_list, size, table_ref, destination_table, credentials, write_disposition='WRITE_APPEND', verbose=False)
Deprecated
The augmented antiseed is no longer used: human annotations now supply "close" negative examples.
Sample the augmented antiseed and write it to `destination_table` (the function itself returns `None`).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `primary_key` | `PrimaryKey` | table primary key | required |
| `tech_class` | `TechClass` | technological class considered | required |
| `pc_list` | `str` | list of technological classes to draw from, comma-separated (e.g. A,B,C) | required |
| `size` | `int` | size of the antiseed | required |
| `table_ref` | `str` | expansion table (project.dataset.table) | required |
| `destination_table` | `str` | query results destination table (project.dataset.table) | required |
| `credentials` | `Path` | credentials file path | required |
| `write_disposition` | `str` | BQ write disposition | `'WRITE_APPEND'` |
| `verbose` | `bool` | verbosity | `False` |
Usage:
techlandscape antiseed get-aug-antiseed family_id cpc Y12,Y10 300 <expansion-table> <destination-table> credentials_bq.json
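The `pc_list` argument in the usage line is a single comma-separated string (`Y12,Y10`) that the function splits before building its class filter. The sketch below illustrates that step under stated assumptions: `split_pc_list` and `build_pc_like_clause` are hypothetical names, the real `get_pc_like_clause` is not shown on this page and may behave differently, and the `.code` field is assumed for the unnested class column.

```python
def split_pc_list(pc_list: str) -> list[str]:
    """Split a comma-separated class string, dropping blanks and stray spaces."""
    return [pc.strip() for pc in pc_list.split(",") if pc.strip()]


def build_pc_like_clause(column: str, classes: list[str]) -> str:
    """Hypothetical stand-in for get_pc_like_clause: OR together prefix matches."""
    return " OR ".join(f"{column}.code LIKE '{pc}%'" for pc in classes)


classes = split_pc_list("Y12, Y10,")
clause = build_pc_like_clause("cpc", classes)
print(classes)  # ['Y12', 'Y10']
print(clause)   # cpc.code LIKE 'Y12%' OR cpc.code LIKE 'Y10%'
```

Note that the function's own `pc_list.split(",")` keeps surrounding whitespace and empty items, so passing the classes without spaces, as in the usage line, is the safe form.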
Source code in techlandscape/antiseed.py
@app.command(deprecated=True)
def get_aug_antiseed(
    primary_key: PrimaryKey,
    tech_class: TechClass,
    pc_list: str,
    size: int,
    table_ref: str,
    destination_table: str,
    credentials: Path,
    write_disposition: str = "WRITE_APPEND",
    verbose: bool = False,
) -> None:
    """
    !!! warning "Deprecated"
        The augmented antiseed is no longer used: human annotations now supply "close" negative examples.

    Sample the augmented antiseed and write it to `destination_table`

    Arguments:
        primary_key: table primary key
        tech_class: technological class considered
        pc_list: list of technological classes to draw from, comma-separated (e.g. A,B,C)
        size: size of the antiseed
        table_ref: expansion table (project.dataset.table)
        destination_table: query results destination table (project.dataset.table)
        credentials: credentials file path
        write_disposition: BQ write disposition
        verbose: verbosity

    **Usage:**
        ```shell
        techlandscape antiseed get-aug-antiseed family_id cpc Y12,Y10 300 <expansion-table> <destination-table> credentials_bq.json
        ```
    """
    project_id = get_project_id(primary_key, credentials)
    # Split the comma-separated CLI argument into individual class symbols
    pc_list = pc_list.split(",")
    pc_like_clause_ = get_pc_like_clause(tech_class, pc_list, sub_group=True)
    country_prefix = get_country_prefix(primary_key)
    # Draw up to `size` distinct keys in random order, filtered by the class clause
    query = f"""
    SELECT
      DISTINCT(r.{primary_key.value}) AS {primary_key.value},
      "ANTISEED-AUG" AS expansion_level
    FROM
      `{project_id}.patents.publications` AS r,
      UNNEST({tech_class.value}) AS {tech_class.value} {country_prefix}
    LEFT OUTER JOIN
      {table_ref} AS tmp
    ON
      r.{primary_key.value} = tmp.{primary_key.value}
    WHERE
      {pc_like_clause_}
    ORDER BY
      RAND()
    LIMIT
      {size}
    """
    # Run the query and write the result to the destination table
    get_bq_job_done(query, destination_table, credentials, write_disposition, verbose)
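The query left-joins the expansion table as `tmp`, and the only visible filter is the class clause; whether `get_pc_like_clause(..., sub_group=True)` also filters on `tmp` cannot be told from this page. Assuming the join is meant to exclude publications already present in the expansion table (an assumption, not something the source states), the usual anti-join pattern needs an `IS NULL` test on the joined key. The sketch below shows the difference on toy data, using SQLite from the standard library as a stand-in for BigQuery; the table contents are made up for illustration.

```python
import sqlite3

# In-memory toy tables standing in for the publications and expansion tables
con = sqlite3.connect(":memory:")
con.executescript(
    """
    CREATE TABLE publications (family_id TEXT);
    CREATE TABLE expansion (family_id TEXT);
    INSERT INTO publications VALUES ('a'), ('b'), ('c');
    INSERT INTO expansion VALUES ('b');
    """
)

# LEFT OUTER JOIN alone keeps every left-hand row, matched or not
join_only = """
    SELECT r.family_id FROM publications AS r
    LEFT OUTER JOIN expansion AS tmp ON r.family_id = tmp.family_id
    ORDER BY r.family_id
"""

# Adding an IS NULL test on the joined key keeps only unmatched rows
anti_join = """
    SELECT r.family_id FROM publications AS r
    LEFT OUTER JOIN expansion AS tmp ON r.family_id = tmp.family_id
    WHERE tmp.family_id IS NULL
    ORDER BY r.family_id
"""

rows_join_only = [row[0] for row in con.execute(join_only)]
rows_anti_join = [row[0] for row in con.execute(anti_join)]
print(rows_join_only)  # ['a', 'b', 'c']
print(rows_anti_join)  # ['a', 'c']
```

If the helper does not already add such a condition, the antiseed sample may contain publications that are also in the expansion table.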