GSA stands for Google Search Appliance: a piece of hardware that implements Google search inside your organization.
The GSA is one of the most brilliant services an organization could use: you plug the box into the network, define which servers and services to index, and you get search with full Google capabilities inside your organization. There are lots of advantages to using a GSA as a search service:
- Good price: it is fairly cheap compared with other products and services.
- Highly reliable: I have never heard of any issues or support calls related to it (in contrast to other IT services, which usually consume hours and days of support).
- Google search is still unbeatable: Google’s ranking method is still the best, even though other search services (like Bing and many smaller ones) are improving rapidly and may beat Google in the future.
- KISS (Keep It Simple, Stupid): the administration GUI is simple to use, I would say straightforward. I don’t see any reason why an experienced IT support person could not use it without any training.
- Fast: implementation is as fast as your ability to design a webpage that calls the GSA API and displays the results (I would say that if it takes you more than a week to install, check carefully whether your IT support vendor is the right one).
- Customization: it offers great scope for customization and additional services.
- Format safe: the GSA (like Google search) will index many formats, including HTML, PDF, DOC, PPT and a long list of others.
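As a rough sketch of the "webpage that calls the GSA API" point above: the appliance is queried over plain HTTP and can return XML, so the calling code can be very small. Everything specific below is an assumption for illustration and should be checked against your own appliance's documentation: the host name is hypothetical, the parameter names (q, site, client, output) and the result tags (R, U, T) are how I remember the search protocol, and the sample XML is made up.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Hypothetical appliance address; replace with your own.
GSA_HOST = "http://gsa.example.internal"


def build_search_url(query, collection="default_collection",
                     frontend="default_frontend", num=10):
    """Build a search URL (parameter names are assumed, see text above)."""
    params = {
        "q": query,              # the user's search terms
        "site": collection,      # which collection to search
        "client": frontend,      # which front end configuration to use
        "output": "xml_no_dtd",  # ask for raw XML results
        "num": num,              # maximum number of results
    }
    return GSA_HOST + "/search?" + urlencode(params)


def parse_results(xml_text):
    """Extract URL and title from each result element (assumed tag names)."""
    root = ET.fromstring(xml_text)
    return [
        {"url": r.findtext("U", default=""),
         "title": r.findtext("T", default="")}
        for r in root.iter("R")
    ]


# Made-up response, only to show the shape the parser expects.
SAMPLE_XML = (
    "<GSP><RES><R N='1'>"
    "<U>http://intranet.example/policy.pdf</U><T>Leave policy</T>"
    "</R></RES></GSP>"
)

if __name__ == "__main__":
    print(build_search_url("heat pump"))
    print(parse_results(SAMPLE_XML))
```

A real page would fetch the URL (for example with urllib.request.urlopen) and render the parsed list as HTML; that part is left out so the sketch runs without an appliance.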
However, using a GSA carries two major risks that, unfortunately, many government organizations and commercial companies in New Zealand do not consider. I have seen the GSA enter many agencies and government organizations in Wellington, and I have implemented it myself in two major Government organizations (major = in the top 10 list of New Zealand Databases).
The two major risks in using a GSA: privacy and security.
1) The Privacy Act and the GSA:
The GSA is a black box: it is closed, and the organization is not able to access the hardware or the physical storage inside it. It is almost like having “the cloud”, but inside your network. You can indeed define what SHOULD be indexed, but that setting only determines what is searchable through the API; it does not guarantee that this is all that is indexed in practice. According to the Privacy Act, a person can (at any time) ask an organization to let them view or remove all the private information it has collected about them. Could you do that AFTER you have indexed your information with a GSA? I believe that one day a random court case will challenge all the Government organizations that plugged a GSA into the network without considering or managing this risk.
2) Security: Google does indeed have the best developers on this planet, and they create the most reliable and secure applications ever. However, we are letting the cat guard the cheese when we use Google Inc. to index our internal content. Google is a company that makes its income from selling information: advertisements, data mining and data intelligence are its main lines of business. Maybe when you implement a GSA, Google is not selling your information directly to third parties, but I doubt that it makes no use of it at all for selling statistical information (for example, how many households in NZ have a heat pump; let’s not forget that Google is planning to resell electricity, and these stats could help them!).
The GSA is a black box which (as we said) indexes content, i.e. replicates (copies) this content into the GSA’s database. So when you implement it, the GSA contract includes the following:
- The box is closed
- Only Google’s headquarters in California is allowed to access or connect to the box.
- In order to receive support, the client is required to open a firewall port allocated for Google California.
So basically, this means that Google uses the local vendors that implement the box only as resellers. Even Google Asia Pacific, which is headquartered in Sydney, doesn’t have access to a GSA box installed in Wellington, New Zealand. Now you need to add 1+1 to assess the risk:
1. Is your GSA indexing secured information?
1. (the other 1) The box cannot be accessed from outside your network, EXCEPT by Google California.
This means that if you handle secured information, all of it could be accessed by Google Inc. This should be a major factor in deciding whether or not to implement a GSA, both for Government organizations and for commercial companies that own secured information.