User Wiki:Demobanker/internet research tips

I've built up a collection of internet researching knowledge over the years of adding fanworks to the wiki. I hope these can help you save time too.

General Link Searching Tips

Bookmark 特殊:链接搜索 (Special:LinkSearch)! You can enter a URL and find every place on the wiki that links to it or to any of its subpages. I use it to check whether someone (for example, an illustrator) is already linked on an existing circle page. That's helpful because I can then link to that person's circle, which makes it easy to check other works they're involved in. It also reduces the number of duplicated links, which can be time-consuming to update if a link changes or goes defunct.

But there are some limits to what kind of URLs the page can find. Some links that don't appear include these:

  • Youtube, Soundcloud, etc. links used in previews - these use site-specific IDs and not URLs, which is why they don't appear.
  • Interwiki links such as [[pu:abcde]] for Pixiv user links or [[p:abcde]] for Pixiv artwork links. I didn't realize this for a long time, so I've stopped using these types of links for Pixiv. Feel free to change them to external links if you come across them!
  • Archive pages in the Wayback Machine. This is because the search only looks for the start of URLs.

There are also two less obvious situations where a link might not appear in a search:

  • Links using http:// don't appear when searching for https:// and vice versa
  • Different letter cases in the URL after the domain. The search is case sensitive in those parts, which means example.com/Test doesn't appear when searching for example.com/TEST
    • However, the search isn't case sensitive in the domain name, so searching for Twitter.com/test or TWitteR.com/test is the same as searching for twitter.com/test
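Since the search treats http:// and https:// as different links, one way to avoid missing results is to search for both forms of a URL. Here's a minimal Python sketch of that idea. The function name link_search_variants is made up for illustration and isn't part of the wiki. It lowercases only the domain (which the search ignores anyway) and leaves the rest of the URL alone, because the search is case sensitive there.

```python
from urllib.parse import urlsplit, urlunsplit


def link_search_variants(url: str) -> list[str]:
    """Return the https:// and http:// forms of a URL to try in the link search."""
    # Add a scheme if the input has none, so urlsplit can find the domain.
    parts = urlsplit(url if "://" in url else "https://" + url)
    host = parts.netloc.lower()  # domain case doesn't matter to the search
    # Path and query keep their original letter case (the search is case sensitive there).
    return [
        urlunsplit((scheme, host, parts.path, parts.query, ""))
        for scheme in ("https", "http")
    ]


print(link_search_variants("TWitteR.com/Test"))
# ['https://twitter.com/Test', 'http://twitter.com/Test']
```

If a link might have been added with different letter cases after the domain, those variants still have to be tried by hand.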

Images

Cool tips on getting high quality images.

Image Max URL

Image Max URL can help with getting the highest or original quality images on various websites. There's even a userscript and a Firefox addon to automatically redirect you to the best quality image. It's most useful on Melonbooks, where it's not obvious how to get the URL for the best quality image. Some websites don't delete images even if the page that shows the image is unavailable (for example, if someone closes their Booth shop or deletes a track from Soundcloud). In those cases you can still get better quality images if you have the URL to the normal quality image. Incidentally, this is why I usually add a direct image link to my uploads. With that said, Image Max URL doesn't work on Pixiv and Nico Seiga without an account.

See SoundCloud and Steam sections for more useful image tips.

SauceNao

SauceNao is a great reverse image search site for the purposes of this wiki. It is constantly updated with new data from Pixiv and Nico Seiga (and sometimes Twitter), and it links to reuploader image board services including Danbooru, Gelbooru and Yande.re if the images are uploaded there. This is quite useful if you need the artist credit for cover art.

What's more, combined with Image Max URL and Internet Archive, you can find high quality versions of cover art starting from something like a scan on Surugaya, but do make sure the art you find is what's actually used on the release (after cropping if necessary). No use having a high quality image if it isn't accurate.

ascii2d

二次元画像詳細検索 (or ascii2d.net) is a Japanese reverse image search engine that covers the same area as SauceNao. It catches images that SauceNao doesn't and is a good complement to that site.

RIOT

This one isn't about research, but it is about images, and I don't want to make a new userpage just for this one recommendation.

RIOT (or Radical Image Optimization Tool) is a freeware image optimizer that can reduce image file sizes by changing how the images are compressed. I highly recommend running this on any PNGs before you upload, using these settings:

  • Color reduction: True Color - every other setting removes color information from the original image, making the output look different from the original image
  • Compression: Maximum (very slow) - for reference, it'll take about 10 seconds to optimize a 600x600px image and around a minute to optimize a 3000x3000px image

Also, this should be obvious, but don't upload the optimized image if the process actually increases the file size.

I don't recommend using this to optimize JPGs into JPGs as that's a conversion from a lossy to lossy format (even 100 quality isn't a lossless compression), but if you do, use these settings:

  • Chroma subsampling: None (4:4:4) - again, changing this makes the output look different from the original image, though the changes here are more subtle
  • Encoding: Progressive - makes the output file size smaller, and makes it so that the image gradually loads as the viewer's browser receives the image data. Setting it to Standard optimized actually gives it a larger filesize, and makes it not load on browsers until all the data has been received.

Then change the quality value until you find one that has a good size and doesn't look full of JPG compression artifacts.

Twitter

You gotta get an account for this. No workarounds now.

The most important thing you need to learn is the advanced search, or search filters. For example, add "filter:images" to get only images, "until:yyyy-mm-dd" to filter by time, etc. Knowing how doujin artists make announcements is also important. Circle members will usually post a catalog of what the circle will bring to an event, which you can find by searching for "お品書き OR おしながき".

Saw a collab announcement on Twitter, but the organizer didn't link to any of the participants? Get the tweet ID from the URL (it's the numbers at the end) and search for "quoted_tweet_id:[tweet id]". That will show public quote retweets of the announcement tweet, most of which will be made by the participating artists.
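As a small illustration of the quote-search trick, here's a Python sketch that pulls the tweet ID out of a tweet URL and builds the search text. It assumes the usual .../status/[tweet id] URL shape. The function name and the example URL are made up for illustration.

```python
import re


def quote_search_query(tweet_url: str) -> str:
    """Build a "quoted_tweet_id:" search from a tweet URL."""
    # Assumption: the ID is the run of digits right after "/status/".
    match = re.search(r"/status/(\d+)", tweet_url)
    if match is None:
        raise ValueError(f"no tweet ID found in {tweet_url!r}")
    return f"quoted_tweet_id:{match.group(1)}"


# Hypothetical URL, used only to show the shape of the input.
print(quote_search_query("https://twitter.com/example_user/status/1234567890123456789?s=20"))
# quoted_tweet_id:1234567890123456789
```

Paste the result into the Twitter search box as-is.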

SoundCloud

SoundCloud retains artwork data even when the track that goes with it has been deleted. If you have an archived version of a track page, or an archived user page that shows the deleted track, you can get the URL of the artwork, remove the web archive part and plug it in Image Max URL to get the original quality artwork. This works even if the archived page doesn't have the artwork saved - all you need is the URL. In theory this should also work on SoundCloud embeds on archived pages, but I haven't found an example to test.
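The "remove the web archive part" step can be sketched in Python like this. It assumes the archived link looks like web.archive.org/web/[timestamp][optional modifier such as im_]/[original URL], which is the common Wayback Machine shape. The function name and the example artwork URL are made up for illustration.

```python
import re

# Assumed Wayback Machine prefix: numeric timestamp, optional modifier
# (for example "im_"), then a slash before the original URL.
WAYBACK_PREFIX = re.compile(r"^https?://web\.archive\.org/web/\d+[a-z_]*/")


def strip_wayback_prefix(archived_url: str) -> str:
    """Return the original URL from a Wayback Machine link (unchanged if not archived)."""
    return WAYBACK_PREFIX.sub("", archived_url, count=1)


# Hypothetical artwork URL, used only to show the shape of the input.
archived = (
    "https://web.archive.org/web/20200101000000im_/"
    "https://i1.sndcdn.com/artworks-example-t500x500.jpg"
)
print(strip_wayback_prefix(archived))
# https://i1.sndcdn.com/artworks-example-t500x500.jpg
```

The output is the URL you would then plug into Image Max URL.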

Booth

Similar to above, the site retains artwork data of deleted items. If you have an archived page and can get the URL to the cover art, even if it's a thumbnail, you can get the full version of it.

Youtube

Youtube's search filters, while useful, still leave a lot to be desired. For example, you can't use these filters to only show videos from before 2015. However, like with Google, there are search operators you can use to fine-tune results.

  • +[search term] requires the term to appear in results.
  • -[search term] excludes the term from results. Make sure to put a space before the -, and no space after, so the search engine knows that you want to exclude the term and aren't trying to search for something like a hyphenated name.
  • before:[year] and after:[year] filters by year ranges. You can also use a date instead of a year for more precision.
  • intitle:[search term] limits the search area to video titles.

These also work with the search on user/channel pages.
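To show how the operators combine, here's a minimal Python sketch that assembles a search string from the pieces listed above. The function and parameter names are made up for illustration, and it only builds text for you to paste into the search box. It doesn't talk to Youtube at all.

```python
def youtube_query(base, require=(), exclude=(), intitle=None, after=None, before=None):
    """Join a base search with +, -, intitle:, after: and before: operators."""
    parts = [base]
    parts += [f"+{term}" for term in require]   # terms that must appear
    parts += [f"-{term}" for term in exclude]   # terms to leave out
    if intitle:
        parts.append(f"intitle:{intitle}")
    if after:
        parts.append(f"after:{after}")
    if before:
        parts.append(f"before:{before}")
    # Joining with single spaces puts a space before each "-" and none after it.
    return " ".join(parts)


print(youtube_query("touhou arrange", exclude=["remix"], before="2015"))
# touhou arrange -remix before:2015
```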


Tired of Youtube's bloated memory hog of a website? Use the Piped frontend. You can enable a "watch on youtube" button to get the Youtube link. As a bonus you'll also be free of tracking and pesky anti-adblock popups.

Steam

So far I've found SteamDB to be the best way to get info on Steam applications. It's a third-party website that gets information using Steam's APIs and through parsing store pages.

One benefit of using this site is that you can get a game's library capsule image, which is the portrait/poster-like picture that's shown in Steam library. I prioritize this when adding covers for Steam games because you only see it when you own the game, which makes it more representative than the store page header. However, not all games will have a library capsule image. To check if it exists, go to the library_assets section and see if it has an entry for library_capsule. If it doesn't, use the header image, which is in the header_image section.
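The "capsule first, header as fallback" rule can be written down as a tiny Python sketch. It doesn't fetch anything from SteamDB. It assumes you've copied the relevant values by hand into a dict whose keys mirror the section names above, and that dict shape, the function name and the example values are all hypothetical.

```python
from typing import Optional


def pick_cover_url(app_info: dict) -> Optional[str]:
    """Prefer the library capsule image and fall back to the header image."""
    # Hypothetical shape: {"library_assets": {"library_capsule": ...}, "header_image": ...}
    library_assets = app_info.get("library_assets") or {}
    capsule = library_assets.get("library_capsule")
    if capsule:
        return capsule
    return app_info.get("header_image")


# Hand-filled example values, not real SteamDB data.
with_capsule = {"library_assets": {"library_capsule": "capsule.jpg"}, "header_image": "header.jpg"}
without_capsule = {"library_assets": {}, "header_image": "header.jpg"}
print(pick_cover_url(with_capsule))     # capsule.jpg
print(pick_cover_url(without_capsule))  # header.jpg
```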

Pixiv

The old URL format for artworks is

http://www.pixiv.net/member_illust.php?mode=medium&illust_id=[artwork_id]

for single images or

http://www.pixiv.net/member_illust.php?mode=manga&illust_id=[artwork_id]

for multiple images. This can be useful when checking saved pages in the Wayback Machine.
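If you have an artwork ID and want to try both old-style URLs in the Wayback Machine, a short Python sketch like this builds them from the two formats above. The function name and the example ID are made up for illustration.

```python
def old_pixiv_artwork_urls(artwork_id: int) -> dict[str, str]:
    """Build the old-format Pixiv artwork URLs for one artwork ID."""
    base = "http://www.pixiv.net/member_illust.php"
    return {
        "single": f"{base}?mode=medium&illust_id={artwork_id}",   # single image
        "multiple": f"{base}?mode=manga&illust_id={artwork_id}",  # multiple images
    }


# 12345 is a placeholder ID, not a real artwork reference.
for kind, url in old_pixiv_artwork_urls(12345).items():
    print(kind, url)
```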

The site hasn't changed the URL format for novels, but here it is for completeness:

https://www.pixiv.net/novel/show.php?id=[novel_id]

Privacy note on getting credits from distributor-based music services

Not the best way to title this, but it involves music services whose data are mostly filled in by music distributors like Distrokid and RouteNote. Examples include Spotify, Tidal, Apple Music and Youtube music (including auto-generated Youtube topic videos). These services allow you to see song credits like songwriting, performance and production through a "show credits" option (Spotify), through the file metadata (Apple Music downloads), or in the case of Youtube topic videos, in the video description.

This might sound like a good thing for getting information, but it can cause privacy issues, and here's why.

Most of the data on these services is handled by distributors, and distributors are designed so that artists don't have to spend hours figuring out how music services work. As a result, there are cases where artists' real names were displayed against their wishes because they had given those names to the distributor.

The good thing is, as terribly managed as TDMD is, their laziness actually saved them in this case, as they only put the bare minimum credits on Spotify and other places. But not all Touhou music on these services goes through TDMD, so this section still applies.

This doesn't apply to places where the artist is directly filling in the data, like Booth, Bandcamp, Soundcloud and personal Youtube channels.

links for old stuff

Useful for researching old stuff, mostly from the personal website era (so pre-2010).


general info aggregators


pages for music


pages for games

  • 制作のしおり - Already mentioned in the general section, but they do have a lot of games coverage compared to the others. I've used this to source many fangames' release dates.
  • Operation Jaguar. - Blog about Touhou games. Active from 2009 to 2014.

doujinshi


other links


Since doujinshi.org doesn't seem to be coming back (seriously, Tenetan, at least give people some closure!), these doujinshi databases might be worth looking into: (Source)


Other links I came across that might be useful?