Wiktionary talk:Wikidata


Deployment of arbitrary access

Hello! If you're fine with this, I'll keep communicating about this specific project on this page, so we can coordinate about details. Of course, for big announcements, I'll still reach the community via the Beer Parlour :)

Reminder: once it is activated, "arbitrary access" will allow you to call and display any kind of Wikidata data on English Wiktionary: labels, descriptions, values of the statements... This will allow you to create new tools and templates that reuse the structured data already in Wikidata. A nice example to start with, which we mentioned before: on a Citation page, instead of writing the name of a book, its author, date of publication, etc. by hand, it will be possible to display this information directly by adding only the ID of the work.

Disclaimer: for now, Wikidata contains only data about concepts, not about words. This will exist in the future with the lexicographical data project, but for now, our goal is not to add lexicographical data in Wikidata. Therefore, you should not create data about words in Wikidata.

Date: After talking with the developers and taking into account a few things in our schedule (Wikimania, summer vacations, etc.), I suggest enabling arbitrary access on Thursday, September 7th.

In the meantime, we should work on the documentation, so that Wiktionary editors understand how it works. I'll come back to you very soon with existing documentation, so we can adapt it to the specific needs of Wiktionary.

Is that fine for you? If you have any remaining questions, feel free to ask! Lea Lacroix (WMDE) (talk) 13:46, 1 August 2017 (UTC)

Next step for lexicographical data: demo system

 
Lexicographical data on Wikidata, Lydia Pintscher, Wikimania 2017

Hello all,

During Wikimania, Lydia presented the status of the lexicographical data on Wikidata. You can find the slides here.

We're also happy to announce that there is now a demo system ready, where you can try structured lexicographical data as it will appear on Wikidata. Please note the following:

  • The system is not persistent for now: the information is not stored and will disappear if you reload the page.
  • The structure of the pages is based on the data model, but the content and the properties will be decided by the community in the future. We created a few for the demo; feel free to create others.
  • The design of the page is also expected to change; this is not the final version.

Feel free to try it, give us feedback or ask questions. See also the Phabricator board. Thanks for your support! Lea Lacroix (WMDE) (talk) 14:13, 14 August 2017 (UTC)

Working on documentation

Hello all,

In preparation for enabling Wikidata data access on Wiktionary (and also because it was needed for other projects), I started this documentation page: d:Wikidata:How to use data on Wikimedia projects. It's on Wikidata now, but we could also have a version on English Wiktionary, with, in the future, the documentation of the specific uses, templates, etc. that you would create using the data.

For now, I'd love to have your feedback on this page. If you don't understand something, if you feel that something is missing, or if you have any question that this page should answer, let me know, and we'll improve it!

Also, if you want to start testing the #statements parser function, which you can't do yet on English Wiktionary, you can try it, for example, on a subpage of your user page on Wikidata. I just tried to model an example on my test page.

What do you think? Lea Lacroix (WMDE) (talk) 15:29, 17 August 2017 (UTC)

Example of using Wikidata for a citation

Hello,

On top of the documentation on how to include Wikidata data, I created a rough example of how you could display information from Wikidata on Wiktionary, based on a Moby-Dick citation. It is not complete: some data may be missing from the item describing the edition, and where the parser function does not do exactly what we expect, we should consider using Lua to make more accurate templates.

Feel free to give feedback or ask any questions. As a reminder, access to Wikidata data will be enabled on English Wiktionary on September 7th. Lea Lacroix (WMDE) (talk) 13:11, 1 September 2017 (UTC)

Arbitrary access is now enabled!

Hello,

You can now access Wikidata data from English Wiktionary :)

As a test, I copied my previous example to my userpage on en.wkt. You can run tests of your own! Feel free to add information on this page: when you start using Wikidata data, we would appreciate knowing how the data is useful to you and what questions or problems you may have.

You can find documentation about the simple parser function and other methods to reuse data. If some information is missing, feel free to mention me!

Reminder: for now, Wikidata contains only data about concepts, not about words. This will exist in the future with the lexicographical data project, but for now, our goal is not to add lexicographical data in Wikidata. Therefore, you should not create data about words in Wikidata.

Thanks, Lea Lacroix (WMDE) (talk) 15:17, 7 September 2017 (UTC)

Thank you! --Daniel Carrero (talk) 18:36, 7 September 2017 (UTC)
Nice, merci! I'll experiment with it in the sandbox. Daniel, for Lua, I've noticed you copied Module:wikidata a while ago. This should now be usable, right? – Jberkel (talk) 18:56, 7 September 2017 (UTC)
There's a lot of code in there that has nothing to do with Wikidata. It would need to be cleaned up first. It also appears that it's not suitable for use from other modules, only from templates. —Rua (mew) 19:38, 7 September 2017 (UTC)
@Lea Lacroix (WMDE) The page d:Wikidata:How to use data on Wikimedia projects mentions arbitrary access in terms of the {{#statements:}} parser function. Presumably, the mw.wikibase.getEntity() function in Lua is equivalent to this. The documentation for the latter says it's expensive to call if you use it with an id other than that of the current page. Does this also apply to the parser function version? Also, since currently none of our entry pages are connected to any Wikidata item, all uses of Wikidata will be expensive. This is obviously not desirable for large-scale use. Arbitrary access without limit is pretty essential for us to be able to use the data. How can this be remedied? —Rua (mew) 20:04, 7 September 2017 (UTC)
Talking about expensive, let's try to rewrite water with Wikidata queries :) I just wanted to get the item label for Q283 and couldn't find a quick way to do it. Is there a special syntax for it? I agree about cleaning up Module:wikidata, it is quite bloated, the presentation logic could probably be decoupled from the data fetching part. We could start with a thin wikidata access layer and then build on top of that. Jberkel (talk) 20:35, 7 September 2017 (UTC)
We already have a basic library, documented partially at mw:Extension:Wikibase Client/Lua. For access in templates, you'd use {{#statements:(property name)|from=(entity id)}}. For example, with d:Q31 for Belgium: {{#statements:official language|from=Q31}} gives "Dutch, French, German". If there are multiple values, you get a comma-separated list, which is probably not very useful except for directly displaying it on a page. The Lua-based functions are much more useful, though also more complicated to use. —Rua (mew) 20:44, 7 September 2017 (UTC)
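
(Illustrative note: the comma-separated output described above can be imitated offline. The following is only a rough Python sketch on hand-made data shaped like Wikibase entity JSON, not the parser function's actual implementation. The property id P37 for "official language", the language item ids, and the helper names `statement_values` and `render_statements` are assumptions made for this example.)

```python
# Hand-made data in the shape of Wikibase entity JSON; nothing is fetched from Wikidata.
# Assumed ids: P37 = "official language"; Q7411, Q150, Q188 = Dutch, French, German.
BELGIUM = {
    "id": "Q31",
    "claims": {
        "P37": [
            {"mainsnak": {"snaktype": "value", "datavalue": {"value": {"id": "Q7411"}}}},
            {"mainsnak": {"snaktype": "value", "datavalue": {"value": {"id": "Q150"}}}},
            {"mainsnak": {"snaktype": "value", "datavalue": {"value": {"id": "Q188"}}}},
        ]
    },
}

# Local label table standing in for a label lookup (hypothetical).
LABELS = {"Q7411": "Dutch", "Q150": "French", "Q188": "German"}


def statement_values(entity, property_id):
    """Return the item ids of all statements for one property, in order."""
    values = []
    for statement in entity.get("claims", {}).get(property_id, []):
        snak = statement["mainsnak"]
        # Skip "novalue" / "somevalue" snaks, which carry no datavalue.
        if snak.get("snaktype") != "value":
            continue
        values.append(snak["datavalue"]["value"]["id"])
    return values


def render_statements(entity, property_id, labels):
    """Join the labels of all values with commas, falling back to the raw id."""
    ids = statement_values(entity, property_id)
    return ", ".join(labels.get(item_id, item_id) for item_id in ids)


print(render_statements(BELGIUM, "P37", LABELS))
```

Working with the list returned by `statement_values` rather than with the joined string is what makes conditional logic possible, which is the advantage of the Lua route mentioned above.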
Yes, but how do I get the label (which is not a property), without Lua? I'm looking for something like {{#label:en|from=Q283}} = water. Parser functions should probably only be used sparingly anyway. – Jberkel (talk) 20:59, 7 September 2017 (UTC)
There's nothing wrong with parser functions, our templates use tons of {{#if:s, {{#ifeq:s and {{#switch:es. It just seems that the Wikidata ones aren't particularly useful for doing more than generating Wikitext output. Doing something like "is this item a chemical element" and then showing different output is much more difficult. With Lua this is pretty easy, see below. —Rua (mew) 21:03, 7 September 2017 (UTC)
You're right, the parser function can't display the label (or aliases). It's basically a simple way to display the values of the statements, with few options for customizing the output. For more precise uses, Lua is better.
I'll get more information about the topic of expensive calls and come back to you. Lea Lacroix (WMDE) (talk) 07:23, 8 September 2017 (UTC)
A module with basic functions to start is w:Module:Wikibase. It retrieves labels, descriptions, etc. As for expensive calls, I usually do more than 50 arbitrary calls in Wikipedia infoboxes without any problem. --Vriullop (talk) 08:35, 9 September 2017 (UTC)

A first experiment

In the past, I added {{senseid}} with a Wikidata id to a bunch of pages. Now, I've modified Module:senseid so that it transcludes one of two tracking templates, depending on whether the given Wikidata item is a country or not. And it appears to work, compare Special:WhatLinksHere/Template:tracking/senseid/Wikidata/country and Special:WhatLinksHere/Template:tracking/senseid/Wikidata/not_country. A few like Czechoslovakia are considered by Wikidata to be "former country" which is a different id so those get sorted under non-countries at the moment. Some others appear to be errors, like Rwanda, but are actually accurate: they are sorted as countries, but multiple senses have ids and not all of them are of countries. In the case of Rwanda, it's also a language.
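
(Illustrative note: the country / non-country split described above could look roughly like the following Python sketch. It is not the code in Module:senseid, which is written in Lua. The ids P31 for "instance of" and Q6256 for "country" are assumptions, `Q_FORMER_COUNTRY` is a placeholder rather than a real id, and `make_entity` is a hypothetical helper that builds test data.)

```python
# Assumed ids: P31 = "instance of", Q6256 = "country".
INSTANCE_OF = "P31"
COUNTRY = "Q6256"
TRACKING_PREFIX = "tracking/senseid/Wikidata/"


def instance_of_ids(entity):
    """Collect the ids of every class the item is stated to be an instance of."""
    ids = []
    for statement in entity.get("claims", {}).get(INSTANCE_OF, []):
        snak = statement.get("mainsnak", {})
        if snak.get("snaktype") == "value":
            ids.append(snak["datavalue"]["value"]["id"])
    return ids


def tracking_template(entity):
    """Pick one of the two tracking templates named in the discussion."""
    if COUNTRY in instance_of_ids(entity):
        return TRACKING_PREFIX + "country"
    return TRACKING_PREFIX + "not_country"


def make_entity(*class_ids):
    """Hypothetical helper: build a minimal entity with the given P31 values."""
    statements = [
        {"mainsnak": {"snaktype": "value", "datavalue": {"value": {"id": class_id}}}}
        for class_id in class_ids
    ]
    return {"claims": {INSTANCE_OF: statements}}


rwanda = make_entity(COUNTRY)
# Placeholder id standing in for "former country"; it does not match COUNTRY.
czechoslovakia = make_entity("Q_FORMER_COUNTRY")

print(tracking_template(rwanda))
print(tracking_template(czechoslovakia))
```

Because the check is an exact id match, an item classed only as "former country" lands under `not_country`, which is the behaviour reported above for Czechoslovakia.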

A possibility that opens up from this is allowing {{senseid}} (or some new template) to automatically figure out what kind of thing the definition is about, and then placing it in the appropriate categories. For example, modifying {{senseid}} so that it categorises the countries in Category:en:Countries is now trivial. Of course we'd rather categorise them by continent, but the basic idea is there.

(Technical note: What I did here technically breaks the WT:Wikidata policy, since I modified a live and in-use template. However, all it does is add tracking templates, it doesn't functionally change the template in any other way. I think a 5th exception should be added to the list, which is to allow Wikidata to be used in live templates as long as they only affect the addition of tracking templates, tracking categories, or other diagnostic data.) —Rua (mew) 21:00, 7 September 2017 (UTC)

Interesting. Another possible way to link Wiktionary entries with Wikidata would be via the {{pedia}} template, because most Wikipedia articles are associated with a Wikidata item. For example honey links to Honey which is connected to d:Q10987. – Jberkel (talk) 07:51, 8 September 2017 (UTC)
That would definitely be a good idea. We have to decide how to handle disambiguation pages though.
@Daniel Carrero What do you think of my experiment? —Rua (mew) 11:42, 8 September 2017 (UTC)
I've now extended the experiment so that entries with senses that are countries also have a tracking template for the continent they are in, e.g. Russia is in Special:WhatLinksHere/Template:tracking/senseid/Wikidata/country/Europe and Special:WhatLinksHere/Template:tracking/senseid/Wikidata/country/Asia. Countries that are not in a continent are in Special:WhatLinksHere/Template:tracking/senseid/Wikidata/country/no continent, which is currently empty. In Module:senseid I have opted to manually connect continent ids to names using a table, rather than using the labels from Wikidata itself. This way, if Wikidata ever changes its labels (which isn't likely), then it won't result in invalid category names on our side. Moreover, some countries use Central America as a continent, but the category we have for Central America is going to be merged into North America. This does show the potential difficulties that may arise in mapping Wikidata information to our categories. —Rua (mew) 13:20, 8 September 2017 (UTC)
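
(Illustrative note: the hand-maintained continent table described above might be sketched in Python as follows. This is an assumption-laden illustration, not the table in Module:senseid. The continent ids shown, including Q27611 for Central America, are assumed, and `continent_templates` is a hypothetical name.)

```python
# Manually maintained table from continent ids to local names, so that a label
# change on Wikidata cannot produce an invalid name on the Wiktionary side.
# All ids here are assumptions made for the example.
CONTINENT_NAMES = {
    "Q46": "Europe",
    "Q48": "Asia",
    "Q49": "North America",
    "Q27611": "North America",  # Central America, folded into North America
}

PREFIX = "tracking/senseid/Wikidata/country/"


def continent_templates(continent_ids):
    """Map a country's continent ids to tracking template names, without duplicates."""
    names = []
    for continent_id in continent_ids:
        name = CONTINENT_NAMES.get(continent_id)
        # Ids missing from the table are ignored rather than guessed at.
        if name is not None and name not in names:
            names.append(name)
    if not names:
        return [PREFIX + "no continent"]
    return [PREFIX + name for name in names]


# A country spanning two continents gets two tracking templates.
print(continent_templates(["Q46", "Q48"]))
# Central America and North America collapse into a single entry.
print(continent_templates(["Q27611", "Q49"]))
```

Keeping the names in a local table means the mapping, including local decisions such as a category merge, stays under the control of Wiktionary editors.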

I've done some more work and now have the following tracking templates, which can be considered to stand in for existing topical categories:

Any senses that can't be placed in any of these categories are in Special:WhatLinksHere/Template:tracking/senseid/Wikidata/nothing. —Rua (mew) 14:25, 8 September 2017 (UTC)

I like the idea of doing those experiments. While it's true you are technically breaking WT:Wikidata policy, I support adding this 5th exception: "to allow Wikidata to be used in live templates as long as they only affect the addition of tracking templates, tracking categories, or other diagnostic data." I think it's consistent with the spirit of the other exceptions, to allow the possibility to access Wikidata without affecting the actual content presented to readers. --Daniel Carrero (talk) 08:41, 9 September 2017 (UTC)
Thank you. Could you voice your support at WT:Beer parlour/2017/September#Addition to WT:Wikidata policy? This discussion has received no reactions so far.
What do you think of the experiment itself? Would adding this feature (but with real categories of course) to {{senseid}} be good? Should we create a new template instead? If so, should we abandon the idea of Wikidata senseids as unique identifiers for senses and their translation tables? —Rua (mew) 19:23, 9 September 2017 (UTC)
OK, I voiced my support there. I don't mind about the location of the feature, I'm fine with using {{senseid}} or another template/module. I support using Wikidata senseids as senses when possible. Naturally, in case someone is wondering, Wikidata senseids can't be used in all senses, like the green (adjective) vs. green (noun) that we discussed in the past. --Daniel Carrero (talk) 21:54, 9 September 2017 (UTC)
@Daniel Carrero How do you think we should progress from here? I'd like to do a first run with only the planets. They're a small and well-defined set, so they are easy to debug. —Rua (mew) 17:50, 10 September 2017 (UTC)

Wikidata entity link

I have created Template:Wikidata entity link. It is widely used in all projects for linking to Wikidata in discussion pages. Elsewhere it is generally used with the shortcut {{Q}}, but here {{Q}} is used for quotations. It shows the item's label in the user's preferred language (or the site language) together with its id. Example requested above: {{Wikidata entity link|Q283}} = water (Q283). --Vriullop (talk) 11:11, 9 September 2017 (UTC)

What's the point of the extra redirect (Special:EntityPage)? – Jberkel (talk) 12:48, 9 September 2017 (UTC)
Not really necessary here but for consistency it assures a persistent URI if not used as HTML in a web browser. See d:Wikidata:Data access#Linked Data interface. --Vriullop (talk) 14:11, 9 September 2017 (UTC)

Quick question

Are there Wikidata items for English Wiktionary words? MHC HumanImmuneSystem (talk) 07:47, 19 October 2017 (UTC)

Quick answer, not yet. --Vriullop (talk) 14:41, 19 October 2017 (UTC)

First feedback round after enabling arbitrary access

Hello all,

@Daniel Carrero, Jberkel, CodeCat, Vriullop

Arbitrary access to Wikidata data on English Wiktionary was enabled 6 months ago. I'm very interested to know what you had the opportunity to do with it, which problems you encountered, and so on. I'd really appreciate it if you would take the time to answer some (not necessarily all) of these questions:

  • Did you try to use Wikidata's data on English Wiktionary?
  • If yes, for what use case(s)? What were your goals, the things you tried to do?
  • What worked? What did not work? Did anything unexpected happen?
  • Did something block you in your experiment?
  • How do you see this experiment continuing? What would you like to do now?

Feel free to add links, screenshots, etc. so I can understand better what you've been working on.

Thanks in advance for your feedback! Lea Lacroix (WMDE) (talk) 13:34, 7 March 2018 (UTC)

@Lea Lacroix (WMDE): We recently added Wikidata ids to our language database (discussion). We use the ids to generate links to Wikipedia articles for languages (you'll see them in the etymology section). It worked as expected; I just had to do some data cleanup (merging duplicates etc.) on Wikidata, but nothing too complicated. Our language "database" is very memory intensive, and adding the Wikidata items caused some memory issues on pages with many translations (not directly related to Wikidata though). A potential solution could be to move more data to Wikidata, but it's not clear what other performance implications that might have (some "expensive" operations are documented, but details are missing, e.g. whether results are cached). Additionally, we'll always have Wiktionary-specific data (unofficial language codes etc.) that doesn't really belong on Wikidata, so we'll need some sort of hybrid storage. – Jberkel 11:32, 8 March 2018 (UTC)
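
(Illustrative note: the hybrid arrangement described above, with local language data plus optional Wikidata ids, might be sketched in Python like this. The table layout, the `xx-unofficial` code, the sitelink titles, and the function name `wikipedia_link` are all hypothetical. The actual Wiktionary language modules are written in Lua and are organised differently.)

```python
# Local language table: Wiktionary-specific data plus an optional Wikidata id.
# The ids and the layout are assumptions for this sketch.
LANGUAGES = {
    "en": {"name": "English", "wikidata_item": "Q1860"},
    "fr": {"name": "French", "wikidata_item": "Q150"},
    # An unofficial code with no Wikidata item: this data lives only locally.
    "xx-unofficial": {"name": "Example lect", "wikidata_item": None},
}

# Stand-in for a sitelink lookup (item id -> English Wikipedia article title).
SITELINKS = {
    "Q1860": "English language",
    "Q150": "French language",
}


def wikipedia_link(code):
    """Return a wikilink to the language's Wikipedia article, or the plain name."""
    language = LANGUAGES[code]
    item = language.get("wikidata_item")
    title = SITELINKS.get(item) if item else None
    if title is None:
        # No item or no sitelink: fall back to unlinked text.
        return language["name"]
    return "[[w:{}|{}]]".format(title, language["name"])


print(wikipedia_link("en"))
print(wikipedia_link("xx-unofficial"))
```

The fallback branch is the point of the sketch: languages without a Wikidata item still render, so Wiktionary-only codes can coexist with data that comes from Wikidata.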
Just to add something I forgot: there was also an initiative to use Wikidata ids to disambiguate senses. Unfortunately there's no consensus for this. See the corresponding vote: Wiktionary:Votes/2017-11/Placing Wikidata ID in sense ID of proper nouns. – Jberkel 10:43, 14 March 2018 (UTC)

Update about lexicographical data on Wikidata

Hello all,

Here is some brief news about the progress on lexicographical data on Wikidata:

  • The portal has been renamed to better fit the scope of the project (lexicographical data can be reused on Wiktionaries, but that is not its only purpose).
  • A list of queries and list of tools collect what people would like to do with the data.
  • The documentation about interwikilinks for Wiktionaries has been improved, and a feedback round about the automatic links system will start next week.
  • A lot of discussions are taking place on the talk page, for example about existing vocabulary on Wikidata, possible vandalism involving trademarks, modeling experiments... You're more than welcome to participate in these; the input from Wiktionary editors is very valuable for the project.
  • The deployment of the first experiment of lexicographical data on Wikidata will take place at the end of May. This will include only Lexemes and Forms so far, no Senses yet. The data that the editors will enter will be released under CC0. No automatic import from Wiktionary or any other source will be done by the development team, and we advised the volunteers to refrain from doing massive imports as well. When the interface is deployed, you can try it and suggest some improvements.
  • A process has started to define the new properties that will be needed for Lexemes and Forms. Feel free to have a look and leave comments. If you have questions about the community processes on Wikidata, I'll be happy to answer.

If you have some feedback, feel free to ping me. Lea Lacroix (WMDE) (talk) 12:15, 18 April 2018 (UTC)
